Comparative Study of K-means, Expectation-Maximization and Density Based Clustering Algorithm

Upadhaya, Deepa

Files

final one.pdf (1.88 MB)

Date

2018

Authors

Upadhaya, Deepa

Publisher

Department of Computer Science and Information Technology

Abstract

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. This dissertation, entitled “Comparative Study of K-means, Expectation-Maximization and Density Based Clustering Algorithm”, is an application of data mining to heart disease and thyroid disease datasets. A wide range of clustering algorithms is available, and this research presents a comparative study of three of them. In the experiments, the accuracy of each algorithm and the time it takes are evaluated by comparing results on the heart disease and thyroid disease datasets, which were obtained from the UCI and KEEL repositories, using the WEKA tool. In total, 597 instances from the heart disease dataset and 3772 instances from the thyroid disease dataset are used to run the algorithms. The heart disease dataset has 14 attributes and the thyroid disease dataset has 30 attributes. Expectation-maximization clustering and density-based clustering take more time to form clusters on both datasets. The simple K-means clustering algorithm forms clusters in less time and with higher accuracy than the other algorithms on both datasets. In terms of both time and accuracy, K-means produces better results than the other two algorithms.
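
The kind of measurement described in the abstract (time to form clusters, plus accuracy against known class labels) can be illustrated with a minimal sketch. This is not the WEKA procedure used in the dissertation. The toy data, the function names `kmeans` and `cluster_accuracy`, and the majority-class scoring rule are all assumptions made for illustration, since the abstract does not state how accuracy was computed.

```python
import time
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_accuracy(labels, classes):
    """Assumed scoring rule: map each cluster to its majority class,
    then return the fraction of points whose class matches that mapping."""
    correct = 0
    for c in set(labels):
        member_classes = [cls for lab, cls in zip(labels, classes) if lab == c]
        correct += Counter(member_classes).most_common(1)[0][1]
    return correct / len(labels)


# Toy, made-up data (not from the heart or thyroid datasets):
# two well-separated groups with known class labels.
points = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
classes = ["a", "b", "a", "b", "a", "b"]

start = time.perf_counter()
labels = kmeans(points, k=2)
elapsed = time.perf_counter() - start
accuracy = cluster_accuracy(labels, classes)

print(f"accuracy={accuracy:.2f} time={elapsed:.6f}s")
```

Running the same timing and scoring harness around an expectation-maximization or density-based clusterer would give numbers that can be compared side by side, which is the shape of the comparison the abstract reports.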