Fuzzy Decision Tree to Predict Student Success in Their Studies

The number of students graduating on time is an important aspect in the accreditation assessment of a university. However, many students still exceed the target graduation time. Predicting on-time graduation can therefore serve as an early warning that allows university management to prepare strategies for preventing dropout. The purpose of this research is to build a model using a fuzzy decision tree to form classification rules, which are then used to predict the success of a student's study using a fuzzy inference system. The resulting model has 28 classification rules when θr is 98% and θn is 3%, with an accuracy of 95.85%. The accuracy of the Fuzzy ID3 algorithm is higher than that of the ID3 algorithm in predicting the timely graduation of students.


Introduction
The quality of a university can be seen not only from the average time its graduates take to find a job, but also from the average length of study of its students. The study program is obliged to monitor students' study progress. Predicting on-time graduation can serve as an early warning about students' study performance. Furthermore, the overall prediction can be used as a reference in evaluating the educational process, the curriculum, and other matters relating to education. Predictions can be made in various ways, one of which is data mining. Kwik Kian Gie School of Business has a dataset in its Academic Information System that has not been fully utilized. It would be unfortunate if such a large dataset were not mined for the information it contains.
Classification is a data mining method for determining the class label of a record in the data (Han and Kamber, 2006). The classification technique that is the focus of this research is the decision tree. In the decision tree method, continuous attributes must be discretized: the range of attribute values is divided using cut points, and those cut points define value domains with crisp boundaries, which may cause misclassification. Determining the cut points is therefore crucial. Fuzzy logic offers an alternative for expressing something that cannot be defined precisely. In a fuzzy set, the degree of membership plays a very important role in determining the presence of an element in the set.
Several studies in the academic domain have used decision tree classification techniques. Vasani and Gawali (2014) classified and evaluated student performance using the C4.5 decision tree algorithm and Naive Bayesian. Adhatrao et al. (2013) applied the ID3 and C4.5 algorithms to predict the performance of students in the first semester. Other studies applied the ID3, C4.5, and CART decision tree algorithms to predict the performance of students majoring in engineering, and to predict the performance of students at Purvachal University in India. Mustafidah and Aryanto (2012) used a fuzzy inference system to predict student learning achievement based on the national exam, a test of academic potential, and learning motivation.

e-ISSN 2721-477X p-ISSN 2722

Research applying the fuzzy decision tree technique includes software effort estimation (Idri and Elyassami, 2011), evaluation of employee performance (Li et al., 2012), prediction of bankruptcy based on qualitative factors using the Fuzzy ID3 algorithm (Martin et al., 2012), and a fuzzy decision tree built on the iris data (Yun et al., 2014).
Building on the research above, this study classifies graduation data using a decision tree and predicts the timely graduation of students using a fuzzy inference system (Mustafidah and Aryanto, 2012). However, studies that use a fuzzy inference system still rely on rule sets defined by the researchers themselves rather than derived from the data, which can keep the resulting accuracy below its maximum. This research therefore builds a classification model of student graduation data using the fuzzy decision tree method. The rules produced by the fuzzy decision tree are then used in the fuzzy inference system as the classification rule base for predicting the timely graduation of students. The results of this research can be used by university management to provide treatment and warnings for students who are predicted not to graduate on time.

Materials
The data were obtained from the 2015 graduation data of Kwik Kian Gie School of Business for the accounting and management majors from 2008 to 2010. The data combine several tables drawn from the student profile, student transcript, and student attendance datasets. Only clean data were used, that is, data that are relevant, not redundant, and free of missing values. Merging the three datasets produced 410 records and 6 attributes relevant to graduation on time: 1st semester GPA (Grade Point Average), 2nd semester GPA, the Pancasila grade, the citizenship grade, attendance, and the average grade over two high school classes. The Pancasila and citizenship grades are averaged to form a new attribute named Behavior. Attendance represents the discipline of a student, so this attribute is named Discipline. The final result is five attributes: 1st semester GPA, 2nd semester GPA, Discipline, Behavior, and Raport (the high school average). The target class has two categories: graduated on time (study period ≤ 48 months) and not graduated on time (study period > 48 months).
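As a minimal sketch of the attribute preparation described above, the following Python function derives the Behavior attribute and the target class for one student record. The function name, the field names, and the sample values are hypothetical; only the averaging rule and the 48-month cutoff come from the text.

```python
def prepare_record(gpa_sem1, gpa_sem2, pancasila, citizenship,
                   attendance, raport, study_months):
    """Build the five predictor attributes and the target class for one student.

    Field names are illustrative assumptions, not the authors' schema.
    """
    return {
        "gpa_sem1": gpa_sem1,
        "gpa_sem2": gpa_sem2,
        # Discipline is the attendance count under a new name.
        "discipline": attendance,
        # Behavior is the mean of the Pancasila and citizenship grades.
        "behavior": (pancasila + citizenship) / 2,
        "raport": raport,
        # Class 1 = graduated on time (<= 48 months), class 2 = not on time.
        "class": 1 if study_months <= 48 else 2,
    }


# Made-up example values, for illustration only.
example = prepare_record(3.2, 3.4, 80, 90, 190, 78.5, 47)
```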

Methods
The framework of this study is described in the flow chart in Figure 1.

Fuzzy ID3 Algorithm
The fuzzy ID3 algorithm is an efficient algorithm for creating a fuzzy decision tree. The algorithm is as follows (Liang, 2005; Mujiarto et al., 2019):
a. Create a root node that holds the fuzzy data set with all membership values equal to 1.
b. If a node t with fuzzy data set D satisfies any of the following conditions, then it is a leaf node and is assigned the class name.
• The proportion of a class $C_k$ is greater than or equal to $\theta_r$:
$$\frac{|D^{C_k}|}{|D|} \geq \theta_r \quad (1)$$
• The size of the data set is less than $\theta_n$:
$$|D| < \theta_n$$
• There are no attributes left for further classification.
c. If node t does not satisfy the above conditions, then it is not a leaf node, and new sub-nodes are generated as follows:
• For each attribute $A_i$ $(i = 1, \ldots, L)$, calculate the information gain, and select the test attribute $A_{max}$ that maximizes it.
• Divide D into fuzzy subsets $D_1, \ldots, D_m$ according to $A_{max}$, where the membership value of a record in $D_j$ is the product of its membership value in D and the value of $F_{max,j}$ for its value of $A_{max}$ in D.
• Generate new nodes $t_1, \ldots, t_m$ for the fuzzy subsets $D_1, \ldots, D_m$, and label the edges that connect the nodes $t_j$ and t with the fuzzy sets $F_{max,j}$.
• Replace D by $D_j$ $(j = 1, 2, \ldots, m)$ and repeat from step b recursively.
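The division step of the algorithm, where memberships are multiplied down the tree, can be illustrated with a short Python sketch. The function name, the term names, and the membership degrees are made up for illustration; this is not the authors' implementation.

```python
def split_fuzzy_node(node_memberships, term_memberships):
    """Divide a fuzzy data set D into one fuzzy subset per linguistic term.

    node_memberships: membership of each record in the current node D.
    term_memberships: {term: [degree of each record in that term]} for the
        selected attribute (the fuzzy sets F_max,j in the algorithm).
    Returns {term: [membership of each record in the subset D_j]}.
    """
    subsets = {}
    for term, degrees in term_memberships.items():
        # Membership in D_j = membership in D * membership in F_max,j.
        subsets[term] = [w * m for w, m in zip(node_memberships, degrees)]
    return subsets


# At the root every record has membership 1 (step a).
root = [1.0, 1.0, 1.0]
# Hypothetical degrees of three records in two terms of one attribute.
gpa_terms = {"low": [0.8, 0.0, 0.25], "high": [0.2, 1.0, 0.75]}
children = split_fuzzy_node(root, gpa_terms)

# Splitting a child again multiplies the memberships once more.
grandchildren = split_fuzzy_node(children["high"], {"good": [0.5, 0.5, 1.0]})
```

Because memberships are multiplied at each level, a record's weight in a deep node can only shrink or stay the same as the tree grows.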

Fuzzy Entropy and Information Gain
Information gain is a statistical value used to select the attribute that will expand the tree and produce a new node in the ID3 algorithm. Entropy is used to define information gain and is defined as follows:

$$H(S) = \sum_{i=1}^{N} -P_i \log_2 P_i$$

where $P_i$ is the ratio of class $C_i$ in the set of examples $S = \{x_1, x_2, \ldots, x_k\}$.
There are two special cases in boolean classification. The first is when all members of the set S belong to the same class, in which case the entropy is 0 (zero), meaning there is no uncertainty in the classification. The second is when S contains equal numbers of examples from the two classes, in which case the entropy is 1, its maximum.
To expand the tree on an attribute, a standard measure of information gain must first be defined on the set of instances. Information gain is used as the attribute selection measure; it is the reduction in entropy of the sample set after the set is partitioned by the values of an attribute. The information gain of an attribute A is defined as follows:

$$G(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} H(S_v)$$

where $\frac{|S_v|}{|S|}$ is the ratio of the examples with value v of attribute A to the whole set of examples. For a fuzzy data set, the formulas for the entropy of an attribute and for information gain are adjusted because the data are expressed in fuzzy form. The fuzzy entropy of all the data is:

$$H_f(S) = H(S) = \sum_{i=1}^{N} -P_i \log_2 P_i$$

The fuzzy entropy and information gain of an attribute A in the fuzzy ID3 algorithm (FID3) are determined with the following equations:

$$H_f(S, A) = -\sum_{i=1}^{N} \frac{\sum_{j} \mu_{ij}}{|S|} \log_2 \frac{\sum_{j} \mu_{ij}}{|S|}$$

$$G_f(S, A) = H_f(S) - \sum_{v \subseteq A} \frac{|S_v|}{|S|} H_f(S_v, A)$$

where $\mu_{ij}$ is the membership value of the j-th pattern for the i-th class, $H_f(S)$ is the entropy of the set S of training data at the node, $|S_v|$ is the size of the subset $S_v \subseteq S$ of training data with attribute value v, and $|S|$ is the size of the set S (Liang, 2005).
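The fuzzy entropy and fuzzy information gain described above can be sketched in Python as follows. This is an illustrative reading, not the authors' Matlab code: set sizes are taken to be sums of membership degrees, and the subset weights are normalized by their combined size, which equals |S| when each record's term memberships sum to 1. All names and sample values are assumptions.

```python
import math


def fuzzy_entropy(weights, labels):
    """Entropy of the class proportions, each record counted by its membership."""
    total = sum(weights)
    if total == 0:
        return 0.0
    entropy = 0.0
    for cls in set(labels):
        p = sum(w for w, y in zip(weights, labels) if y == cls) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def fuzzy_information_gain(weights, labels, term_memberships):
    """Gain of one attribute given {term: [degree per record]}.

    Assumption: |S_v| is the sum of memberships in subset v, normalized by
    the combined size of all subsets.
    """
    subsets = {
        term: [w * m for w, m in zip(weights, degrees)]
        for term, degrees in term_memberships.items()
    }
    total = sum(sum(ws) for ws in subsets.values())
    if total == 0:
        return 0.0
    remainder = sum(
        (sum(ws) / total) * fuzzy_entropy(ws, labels) for ws in subsets.values()
    )
    return fuzzy_entropy(weights, labels) - remainder


# Four made-up records at the root: two per class.
weights = [1.0, 1.0, 1.0, 1.0]
labels = [1, 1, 2, 2]
separating = {"low": [1, 1, 0, 0], "high": [0, 0, 1, 1]}
uninformative = {"low": [0.5] * 4, "high": [0.5] * 4}
gain_separating = fuzzy_information_gain(weights, labels, separating)
gain_uninformative = fuzzy_information_gain(weights, labels, uninformative)
```

An attribute whose terms separate the classes perfectly recovers the full node entropy as gain, while an attribute with identical memberships for every record yields no gain.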

Fuzziness Control Threshold and Leaf Decision Threshold
The fuzziness control threshold (θr) and the leaf decision threshold (θn) are chosen to obtain the best model with high accuracy (> 80%) for predicting on-time student graduation. For example, if θr = 80% and θn = 20% still produce a model with accuracy below 80%, further experiments with other threshold combinations are conducted to achieve higher accuracy.
If the learning of a fuzzy decision tree continues until all data samples at each leaf node belong to a single class, the resulting accuracy will be low. Therefore, to improve accuracy, learning must be terminated early, which amounts to pruning the tree (Liang, 2005). To this end, two thresholds are defined (Umano et al., 1994).
• Fuzziness control threshold (FCT) / θr
If the proportion of the data set belonging to class Ck is greater than or equal to the threshold θr, the expansion of the tree is stopped.

• Leaf decision threshold (LDT) / θn
If the number of members of the data set at a node is smaller than the threshold θn, the expansion of the tree is stopped.
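The two thresholds can be combined into a single stopping test, sketched below in Python. Since θn is reported as a percentage, this sketch assumes it is compared against the node size relative to the size of the full training set; that interpretation, the function name, and the sample counts are assumptions.

```python
def is_leaf(class_sizes, root_size, theta_r, theta_n, attributes_left):
    """Decide whether tree expansion stops at a node.

    class_sizes: {class: sum of memberships of that class at the node}.
    root_size: size of the full training set (assumed reference for theta_n).
    theta_r, theta_n: thresholds given as fractions, e.g. 0.98 and 0.03.
    """
    node_size = sum(class_sizes.values())
    if node_size == 0 or attributes_left == 0:
        return True
    # Fuzziness control threshold: one class dominates the node.
    if max(class_sizes.values()) / node_size >= theta_r:
        return True
    # Leaf decision threshold: the node holds too little data.
    if node_size / root_size < theta_n:
        return True
    return False


# Made-up nodes, for a training fold of 369 records.
dominated = is_leaf({"on_time": 99, "late": 1}, 369, 0.98, 0.03, 3)
mixed = is_leaf({"on_time": 60, "late": 40}, 369, 0.98, 0.03, 3)
tiny = is_leaf({"on_time": 6, "late": 4}, 369, 0.98, 0.03, 3)
```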

ID3 Algorithm
The ID3 algorithm is among the most widely used decision tree algorithms because it is easy to use and effective (Liang, 2005). It was developed by Ross Quinlan (Quinlan, 1986). ID3 performs a greedy search through the space of possible decision trees. To determine the root node and the subsequent attributes, ID3 calculates entropy values to obtain the information gain. The more diverse the set of sample data, the greater the entropy.
For two classes, entropy values lie in the range 0 to 1. The entropy is calculated as follows:

$$H(S) = \sum_{i=1}^{N} -P_i \log_2 P_i$$

where H(S) is the entropy of the data sample S, N is the number of classes, and $P_i$ is the proportion of samples belonging to class i. Once the entropy is known, the information gain can be calculated. Information gain is a measure of how effectively an attribute or parameter classifies the data.
It is calculated with the following formula:

$$G(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} H(S_v)$$

where $S_v$ is the subset of S for which attribute A has the value v.
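For comparison with the fuzzy variant, the crisp entropy and information gain used by ID3 can be sketched in a few lines of Python. The function names, attribute names, and category labels are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by partitioning on a categorical attribute.

    rows: list of dicts mapping attribute name to a crisp category.
    """
    n = len(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Made-up records: 'gpa' separates the classes, 'behavior' does not.
rows = [
    {"gpa": "high", "behavior": "good"},
    {"gpa": "high", "behavior": "less"},
    {"gpa": "low", "behavior": "good"},
    {"gpa": "low", "behavior": "less"},
]
labels = [1, 1, 2, 2]
gain_gpa = information_gain(rows, labels, "gpa")
gain_behavior = information_gain(rows, labels, "behavior")
```

ID3 would select the attribute with the larger gain as the test attribute at the node.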

Attribute Correlation Analysis
A correlation test was performed on all attributes using the Pearson product-moment correlation; the resulting coefficient for each attribute is summarized in Table 1. As seen in Table 1, all attributes have a negative correlation with graduation on time. The 2nd semester GPA has the strongest correlation, with a coefficient of -0.74. A negative correlation means that the higher the value of a predictor attribute, the shorter the study period. These attributes are then used in this study.
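The Pearson product-moment coefficient used in this analysis can be computed as in the following Python sketch. The sample values are made up to show a perfectly negative relationship and are not taken from Table 1.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical GPAs and study periods in months: higher GPA, shorter study.
gpa = [2.0, 2.5, 3.0, 3.5]
months = [60, 56, 52, 48]
r = pearson(gpa, months)
```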

Data Fuzzification
This research uses the fuzzy decision tree as its data mining technique, so the data must be represented in fuzzy form. The process begins by defining a membership function for each attribute used.

Discipline attributes
The Discipline attribute is divided into 3 groups or linguistic terms, i.e. less (x ≤ 180 attendances), medium (175 ≤ x ≤ 190 attendances), and good (185 ≤ x ≤ 196 attendances). The fuzzy set of each linguistic term uses a trapezoidal curve, as shown in Figure 5.
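A trapezoidal membership function for the Discipline terms might look like the Python sketch below. The supports (up to 180, 175 to 190, 185 to 196) come from the text; the corner points inside them are assumptions, chosen so that the overlaps 175 to 180 and 185 to 190 are linear transitions, and the lower bound of 0 is also assumed. Figure 5 may define the curves differently.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on (a, b), is 1 on [b, c], falls on (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Assumed corner points (a, b, c, d) for each linguistic term.
DISCIPLINE_TERMS = {
    "less": (0, 0, 175, 180),
    "medium": (175, 180, 185, 190),
    "good": (185, 190, 196, 196),
}


def fuzzify_discipline(attendance):
    """Return the membership degree of an attendance count in each term."""
    return {
        term: trapezoid(attendance, *corners)
        for term, corners in DISCIPLINE_TERMS.items()
    }


degrees = fuzzify_discipline(177.5)
```

With these assumed corners, an attendance of 177.5 belongs equally to "less" and "medium", which is the kind of soft boundary that a crisp cut point cannot express.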

Attribute of Target Class
The graduate-on-time attribute is referred to as Class and is represented by two linguistic terms, "graduate on time" and "not graduating on time". The two terms are defined as follows: a) graduate on time = 1 (study period ≤ 48 months), b) not graduating on time = 2 (study period > 48 months).

Results of Formation Model of Fuzzy ID3
The transformed data, divided using 10-fold cross validation, are all treated in the same way by the Fuzzy ID3 algorithm during training. The training process is run 240 times. For each training set, training is run 24 times, using 6 values of θr (75%, 80%, 85%, 90%, 95%, and 98%) and, for each value of θr, 4 values of θn (3%, 5%, 8%, and 10%). The average number of rules generated by the training process and the total execution time can be seen in Table 2 and Table 3.

The values of θr and θn used in Table 2 were selected based on experimental results, because the number of rules generated changes significantly at these values. Table 2 and Table 3 show that the most significant increase occurs when θr is raised from 90% to 95%. This happens because, during expansion on the first training set, many sub-nodes already have one class with a proportion above 90%, so that these sub-nodes do not need to be expanded further when θr is 90% or lower. Table 2 also shows that a higher θr produces more rules, because the tree keeps expanding until a node is dominated by one class whose proportion is greater than or equal to θr. Conversely, the higher the value of θn, the fewer rules tend to be produced.

Vol. 1, No. 3, pp. 135-144, 2020

Figure 7 shows that the higher the value of θr, the longer the execution time needed to form a decision tree. This is in line with the number of rules formed, shown in Figure 6: the more rules are formed, the longer the execution time to form the decision tree.
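The count of 240 training runs follows from the threshold grid and the number of folds, as this short Python sketch shows. The variable names are illustrative; the threshold values are those listed in the text.

```python
from itertools import product

THETA_R = [0.75, 0.80, 0.85, 0.90, 0.95, 0.98]  # fuzziness control thresholds
THETA_N = [0.03, 0.05, 0.08, 0.10]              # leaf decision thresholds
N_FOLDS = 10                                     # 10-fold cross validation

# Every (theta_r, theta_n) combination is trained once per fold.
grid = list(product(THETA_R, THETA_N))
runs = [(fold, theta_r, theta_n)
        for fold in range(N_FOLDS)
        for theta_r, theta_n in grid]

print(len(grid), len(runs))
```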

Testing Model of Fuzzy ID3
To measure the accuracy of the model generated in the training phase, testing is carried out 240 times. Testing is done by inserting the rules derived from the training process into a Mamdani FIS to determine the class of each record in the testing set. Each training process is followed by one testing process.
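How a rule base classifies a fuzzified record can be illustrated with a simplified Python sketch. It uses min for the AND of antecedents and max to aggregate rules with the same class, then picks the class with the strongest support; a full Mamdani FIS would additionally clip output fuzzy sets and defuzzify, which is omitted here. The rules, attribute names, and membership degrees are hypothetical, not rules learned in this study.

```python
def classify(memberships, rules):
    """Pick the class with the strongest rule support for one record.

    memberships: {attribute: {term: degree}} for the fuzzified record.
    rules: list of (antecedent, cls), antecedent being {attribute: term}.
    """
    strength = {}
    for antecedent, cls in rules:
        # AND of the antecedent terms: minimum of their membership degrees.
        firing = min(memberships[attr][term] for attr, term in antecedent.items())
        # Rules concluding the same class are aggregated with max.
        strength[cls] = max(strength.get(cls, 0.0), firing)
    best = max(strength, key=strength.get)
    return best, strength


# Hypothetical rule base: class 1 = on time, class 2 = not on time.
RULES = [
    ({"gpa_sem2": "high", "discipline": "good"}, 1),
    ({"gpa_sem2": "low"}, 2),
]
record = {
    "gpa_sem2": {"low": 0.2, "high": 0.8},
    "discipline": {"less": 0.0, "medium": 0.3, "good": 0.7},
}
label, strength = classify(record, RULES)
```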

Performance Evaluation of Fuzzy ID3
The performance of the Fuzzy ID3 algorithm is evaluated by calculating the average accuracy over the entire testing process on the 10 different testing sets. The performance of Fuzzy ID3 for different values of θr and θn can be seen in Table 4 and Figure 8.

Table 4 and Figure 8 show that the performance of Fuzzy ID3 generally improves as θr increases or θn decreases. Looking at the average testing accuracy, accuracy decreases when θr is raised from 75% to 80% and increases significantly when θr is raised from 90% to 95%. For θn of 8% and 10%, the average accuracy changes in the same way. The highest accuracy is obtained when θr is 95% or 98% with θn at 3%, giving an average accuracy of 95.85%.

Performance Comparison of Fuzzy ID3 and ID3
At this stage, the testing results of the fuzzy decision tree built with the fuzzy ID3 algorithm are compared with those of the decision tree built with the ID3 algorithm. The data used for ID3 are the same graduation data used for fuzzy ID3, namely 410 records. The training method is also the same, 10-fold cross validation. Before training, the data must be transformed into several categories. The number of categories and the value ranges of the attributes are matched to the transformation used for fuzzy ID3. Details of the data transformation can be seen in Table 5. This is done to determine whether a decision tree model built with the fuzzy approach is better than one built with the conventional decision tree method. Table 6 gives the results of the performance comparison between fuzzy ID3 and ID3.

Figure 9 and Figure 10 show that the fuzzy ID3 algorithm generates fewer rules than the ID3 algorithm, yet has a higher accuracy than the ID3 algorithm. From the results of this experiment, it can be concluded that the model generated by the algorithm that uses the fuzzy approach is better than the decision tree algorithm without the fuzzy approach for classifying and predicting students' study periods.

Conclusion
This research has successfully developed a model from graduation data using the fuzzy ID3 algorithm to form classification rules, which are used to predict the timely graduation of students with a Mamdani FIS. The best model produces an average of 25 classification rules when θr is 98% and θn is 3%, with an accuracy of 95.85%. The higher the value of θr and the lower the value of θn, the higher the accuracy. Based on the classification rules formed, the most decisive factor for graduating on time is the 2nd semester GPA. The model produced by the Fuzzy ID3 algorithm has a better accuracy than the ID3 algorithm. In addition, this study produced a simple Matlab application that can be used to create a classifier based on the fuzzy ID3 algorithm. In future work, a genetic algorithm could be used in forming the decision tree to optimize the fuzzy decision tree (FDT), yielding a genetically optimized fuzzy decision tree (G-FDT) that may improve the accuracy of the model in predicting the timely graduation of students.