A Comparative Analysis of Apriori and Clustering Based Apriori Algorithm
Date
2019
Publisher
Department of Computer Science and Information Technology
Abstract
A frequent itemset is a set of items that occurs frequently in a given collection of data.
Frequent itemsets are now widely used in developing marketing strategies. Because data
volumes are growing rapidly, a method capable of handling large volumes of data is needed.
For that purpose, a hybrid clustering-based Apriori algorithm is used to generate frequent
itemsets.
This research compares two frequent itemset generation algorithms (Apriori and
clustering-based Apriori). Its main aim is to evaluate the performance of the two algorithms
on the following parameters: the total number of frequent itemsets generated, the effect of
the support percentage on itemset generation, and the effect of clustering on itemset
generation, for datasets of different dimensions. The datasets were chosen to differ in size,
mainly in the number of attributes and the number of instances. The comparison shows that
the clustering-based Apriori algorithm generates more frequent itemsets than the Apriori
algorithm. In general, increasing the support percentage causes both algorithms to produce
fewer frequent itemsets. When the clustering number is balanced, the number of frequent
itemsets generated is small.
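The two approaches compared above can be illustrated with a minimal Python sketch. It is not the implementation used in this research: the function names, the toy transactions and the cluster assignment are hypothetical, and the clustering-based variant is assumed here to mean running Apriori separately on each cluster with the same relative support and taking the union of the results. The K-Means step is omitted, so the clusters are supplied by hand.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    min_support is a fraction of the number of transactions (0.5 means 50%).
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def clustered_apriori(clusters, min_support):
    """Assumed scheme: mine each cluster separately and union the itemsets."""
    found = set()
    for cluster in clusters:
        found |= set(apriori(cluster, min_support))
    return found


# Hypothetical toy data, not taken from the datasets used in this research.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"tea", "sugar"},
    {"tea", "sugar", "milk"},
]
# Hand-made cluster assignment standing in for a K-Means result.
clusters = [transactions[:4], transactions[4:]]

print(len(apriori(transactions, 0.5)))        # 2
print(len(apriori(transactions, 0.3)))        # 8
print(len(clustered_apriori(clusters, 0.5)))  # 11
```

On this toy input, lowering the support from 50% to 30% raises the number of frequent itemsets, and mining per cluster at 50% support yields more itemsets than mining the whole dataset, because an itemset only has to be frequent within one cluster. This mirrors the direction of the findings reported above but says nothing about their magnitude on real datasets.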
Keywords:
Frequent itemset, Apriori, Clustering-based Apriori, Association Rule Mining, K-Means.