Comparative Study of K-means, Expectation-Maximization and Density Based Clustering Algorithm
Publisher
Department of Computer Science and Information Technology
Abstract
Data mining is the process of analyzing data from different perspectives and summarizing it
into useful information. This dissertation, entitled “Comparative Study of K-means,
Expectation-Maximization and Density Based Clustering Algorithm”, is an application of
data mining to the heart disease and thyroid disease datasets. A wide range of algorithms
is available for clustering, and this research presents a comparative study of three of them.
In the experiments, the accuracy of each algorithm and the time it takes are evaluated by
comparing results on the heart disease and thyroid disease datasets, which were obtained
from the UCI and KEEL repositories, with the experiments run in the WEKA tool.
In total, 597 records from the heart disease dataset and 3772 records from the thyroid
disease dataset are used for implementing the algorithms. The heart disease dataset has 14
attributes and the thyroid disease dataset has 30 attributes.
Expectation-maximization clustering and density-based clustering take more time to form
clusters on both datasets (heart disease and thyroid disease). The simple K-means
clustering algorithm forms clusters in less time and with higher accuracy than the other
two algorithms on both datasets. In terms of time and accuracy, K-means therefore
produces better results than the other algorithms.
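
The two measures compared above, time taken and accuracy, can be illustrated with a minimal sketch. It is not the WEKA setup used in the dissertation: the k-means routine is a plain textbook version with a deterministic initialization, the toy points and class labels are invented for illustration, and the abstract does not state how accuracy was computed, so the majority-class mapping below is only an assumption. The names kmeans, cluster_accuracy and timed are hypothetical helpers.

```python
import time
from collections import Counter


def kmeans(points, k, iterations=20):
    """Textbook k-means; returns one cluster index per point.

    Assumption: centroids start from evenly spaced input points so the
    sketch is deterministic. This is a simplification, not WEKA's method.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_accuracy(labels, classes):
    """Assumed measure: map each cluster to its majority class and
    return the fraction of points whose class matches that mapping."""
    correct = 0
    for c in set(labels):
        members = [cls for lab, cls in zip(labels, classes) if lab == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(classes)


def timed(cluster_fn, *args, **kwargs):
    """Run a clustering function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = cluster_fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Invented toy data: two well-separated groups with known classes.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
classes = ["a", "a", "a", "b", "b", "b"]

labels, seconds = timed(kmeans, points, k=2)
accuracy = cluster_accuracy(labels, classes)
print(f"k-means: accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

Any other clustering function that returns one label per point could be passed to timed and scored with cluster_accuracy in the same way, which is the shape of comparison the abstract describes.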
