A Comparative Analysis of Apriori and Clustering Based Apriori Algorithm

Date
2019
Publisher
Department of Computer Science and Information Technology
Abstract
A frequent itemset is a set of items that occurs frequently in a given collection of data. Frequent itemset mining is now widely used in developing marketing strategies. Because data volumes are growing rapidly, a method capable of handling large volumes of data is needed. For that purpose, a hybrid clustering based Apriori algorithm is used for generating frequent itemsets. This research compares two frequent itemset generation algorithms, Apriori and Clustering based Apriori. Its main aim is to evaluate the performance of these algorithms on the following parameters: the total number of frequent itemsets generated, the effect of the support percentage on itemset generation, and the effect of clustering on itemset generation, for datasets of different dimensions. The datasets were chosen so that they differ in size, mainly in the number of attributes and the number of instances. The comparison shows that the clustering based Apriori algorithm generates more frequent itemsets than the Apriori algorithm. In general, increasing the support percentage causes both algorithms to produce fewer frequent itemsets. When the number of clusters is balanced, the number of frequent itemsets generated is small. Keywords: Frequent itemset, Apriori, Clustering based Apriori, Association Rule Mining and K-Means.
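The two approaches compared in the abstract can be illustrated with a minimal sketch. This is not the implementation used in the research: the function names `apriori` and `clustered_apriori` are hypothetical, the cluster labels are assumed to be precomputed (for example by K-Means), and the way the hybrid combines results (running Apriori inside each cluster and taking the union) is an assumption, since the abstract does not describe that step.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets whose support meets min_support.

    min_support is a fraction of the number of transactions (0.5 = 50%).
    """
    transactions = [frozenset(t) for t in transactions]
    min_count = min_support * len(transactions)

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent


def clustered_apriori(transactions, labels, min_support):
    """Assumed hybrid: run Apriori inside each cluster and union the itemsets.

    labels[i] is the (precomputed) cluster id of transactions[i].
    """
    clusters = {}
    for t, label in zip(transactions, labels):
        clusters.setdefault(label, []).append(t)
    result = set()
    for members in clusters.values():
        result |= set(apriori(members, min_support))
    return result


# Toy data, invented for illustration only.
baskets = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "butter"},
    {"tea", "sugar"}, {"tea", "sugar"}, {"tea", "milk"},
]
labels = [0, 0, 0, 1, 1, 1]  # assumed output of a clustering step

plain = apriori(baskets, 0.5)
hybrid = clustered_apriori(baskets, labels, 0.5)
print(len(plain), len(hybrid))
```

Under these assumptions the support threshold is applied relative to each cluster's size rather than to the whole dataset, so itemsets that are common within one cluster can pass the threshold even when they are rare overall. That is one plausible reason a clustering based variant would report more frequent itemsets than plain Apriori at the same support percentage.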
Keywords
Frequent itemset, Apriori, Clustering based Apriori, Association rule mining, K-Means