Please use this identifier to cite or link to this item: https://elibrary.tucl.edu.np/handle/123456789/10185
Full metadata record
DC Field | Value | Language
dc.contributor.author | Yogi, Ganesh | -
dc.date.accessioned | 2022-05-08T09:42:21Z | -
dc.date.available | 2022-05-08T09:42:21Z | -
dc.date.issued | 2018 | -
dc.identifier.uri | https://elibrary.tucl.edu.np/handle/123456789/10185 | -
dc.description.abstract | Decision tree learning algorithms have been used successfully in expert systems to capture knowledge. The main task performed in these systems is to apply inductive methods to the given attribute values of an unknown object and determine the appropriate classification according to decision tree rules. The decision tree is one of the most effective forms of representation, owing to several attractive features: simplicity, comprehensibility, no parameters to tune, and the ability to handle mixed-type data. Many decision tree algorithms are available, including ID3, C4.5, CART, CHAID, QUEST, GUIDE, CRUISE, and CTREE. In this work, the attribute selection methods of ID3, C4.5, and CART are applied to decision tree induction using meteorological data collected between 2004 and 2008 from the city of Kathmandu, Nepal. A data model for the meteorological data was developed and used to train decision trees with these different attribute selection methods. The performance of these methods was compared using standard performance metrics. Cross-validation is performed to test the built model, i.e. the decision tree: 10-fold cross-validation partitions the dataset into 10 parts and, in each of ten repetitions, uses 90% of the data for training and 10% for testing. Experimental results show that the CART decision tree is slightly more accurate on a large volume of data than the other algorithms, ID3 and C4.5, while in terms of speed C4.5 is better than the other two. The CART decision tree has an average system accuracy rate of 80.9315%, a system error rate of 19.0685%, a precision rate of 83.1%, and a recall rate of 83.1%. Similarly, the C4.5 decision tree has an average system accuracy rate of 80.6849%, a system error rate of 19.3151%, a precision rate of 82%, and a recall rate of 84.4%. The ID3 decision tree has an average system accuracy rate of 28.08%, a system error rate of 4.08%, a precision rate of 89.4%, and a recall rate of 91.3%. In terms of completion time, C4.5 completes in 0.05 seconds and ID3 in 0.32 seconds, whereas CART completes in 251.82 seconds. Keywords: Data Mining, Classification, Classifier, ID3, C4.5, CART, Supervised Learning, Unsupervised Learning, Decision Tree, Information Gain, Gain Ratio, Gini Index. | en_US
dc.language.iso | en_US | en_US
dc.publisher | Department of Computer Science & Information Technology | en_US
dc.subject | Data Mining | en_US
dc.subject | Classification | en_US
dc.subject | Supervised Learning | en_US
dc.subject | Decision Tree | en_US
dc.title | Performance Analysis of Attribute Selection Methods in Decision Tree Induction | en_US
dc.type | Thesis | en_US
local.institute.title | Central Department of Computer Science and Information Technology | en_US
local.academic.level | Masters | en_US
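
The abstract above evaluates each attribute selection method with 10-fold cross-validation, where each fold trains on 90% of the data and tests on the remaining 10%, over ten repetitions. The following is a minimal sketch of that protocol in Python, assuming a scikit-learn workflow with a hypothetical CSV file and target column; the thesis defines its own data model and this record does not name a toolkit, and scikit-learn offers only entropy- and Gini-based splitting as stand-ins for ID3/C4.5 and CART.

# Illustrative sketch only -- file name and target column are assumptions.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Load the meteorological data (path and column names are hypothetical).
data = pd.read_csv("kathmandu_weather_2004_2008.csv")
X = data.drop(columns=["weather_class"])
y = data["weather_class"]

# Entropy-based splitting approximates ID3/C4.5; Gini-based splitting matches CART.
classifiers = {
    "entropy (ID3/C4.5-like)": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "gini (CART)": DecisionTreeClassifier(criterion="gini", random_state=0),
}

# 10-fold cross-validation: each of the ten folds trains on 90% and tests on 10%.
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.4f}")
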
Appears in Collections: Computer Science & Information Technology

Files in This Item:
File | Description | Size | Format
All thesis.pdf | | 1.61 MB | Adobe PDF

