Data Mining Implementation Using Naïve Bayes Algorithm and Decision Tree J48 In Determining Concentration Selection

Budiman Budiman, Reni Nursyanti, R Yadi Rakhman Alamsyah, Imannudin Akbar

Abstract


The computerization of society has substantially improved the ability to generate and collect data from a variety of sources, and large volumes of data now reach almost every aspect of people's lives. AMIK HASS Bandung has an Informatics Management Study Program with three areas of concentration that students choose from in the fourth semester: Computerized Accounting, Computer Administration, and Multimedia. Concentration selection should be guided by past data, so the academic section needs a pattern or rule for predicting it. In this work, the data mining techniques Naïve Bayes and Decision Tree J48 were applied using the WEKA tool. The data set comprised 111 records and was evaluated in percentage split mode: 75% was used as training data to build the models and 25% as test data to evaluate both models. Naïve Bayes achieved the higher accuracy, 71.4%, with 20 of the 28 test instances correctly classified. Decision Tree J48 achieved a lower accuracy of 64.3%, with 18 of the 28 test instances correctly classified. Decision Tree J48 produced 4 patterns or rules for determining concentration selection, which the academic section can use to assist students in choosing a concentration.
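The figures reported above can be checked with a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes the 25% test share of the 111 records is rounded to the nearest whole instance, which gives the 28 test instances the abstract reports. It does not reproduce the WEKA run itself.

```python
# Figures (111 records, 25% test share, 20 and 18 correct) come from the abstract.
# All names below are hypothetical and used only for illustration.
TOTAL_INSTANCES = 111
TEST_FRACTION = 0.25


def split_sizes(total, test_fraction):
    """Return (train, test) sizes, rounding the test share to the nearest integer.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    test = round(total * test_fraction)
    return total - test, test


def accuracy_percent(correct, total):
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


train_size, test_size = split_sizes(TOTAL_INSTANCES, TEST_FRACTION)
nb_accuracy = accuracy_percent(20, test_size)   # Naive Bayes: 20 correct
j48_accuracy = accuracy_percent(18, test_size)  # Decision Tree J48: 18 correct

print(train_size, test_size)
print(nb_accuracy, j48_accuracy)
```

Under these assumptions the split works out to 83 training and 28 test instances, and the two accuracies match the reported 71.4% and 64.3%.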


Keywords


Concentration, Classification, Naïve Bayes, Decision Tree J48






DOI: https://doi.org/10.46336/ijqrm.v1i3.72



Copyright (c) 2020 International Journal of Quantitative Research and Modeling

This work is licensed under a Creative Commons Attribution 4.0 International License.

Published By: 

IJQRM: Jalan Riung Ampuh No. 3, Riung Bandung, Kota Bandung 40295, Jawa Barat, Indonesia

 


 



