Analysis of MST based clustering algorithm with different threshold values
Publisher
Department of Computer Science and Information Technology
Abstract
Cluster analysis is an emerging research topic in data mining because of its wide range of
applications. Many clustering algorithms have been proposed so far, but each has its own merits
and demerits, and none works well in every real-world situation. Clustering algorithms based on the
minimum spanning tree (MST) are widely used because they can detect clusters with irregular
boundaries. The clustering algorithm studied in this dissertation is based on the MST.
This dissertation analyzes an MST-based clustering algorithm using different threshold values on the
MST and evaluates the resulting clusterings with a validity index. Given the MST over a data set, the
edges of the MST are selected or rejected, depending on the threshold value, in the process of forming
the clusters. The validity index is the ratio of intra-cluster distance to inter-cluster distance. Three
thresholds are considered: the mean, the standard deviation, and the mean plus the standard deviation
of the MST edge weights. Each threshold is evaluated by the validity index, and the threshold that gives
the smallest index value is selected as the best threshold, with its clustering taken as the best
clustering. The algorithm has been tested on randomly generated data sets as well as real-world data sets.
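The procedure described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation, and several details are assumptions made only for illustration: the MST is built with Prim's algorithm on Euclidean distances, an MST edge is rejected when its weight exceeds the threshold, the standard deviation is the population standard deviation, and the validity index is computed as the mean squared point-to-centroid distance divided by the smallest squared distance between centroids. All function names (mst_edges, cut_mst, validity_index, best_threshold) are hypothetical.

```python
import math
from statistics import mean, pstdev


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) tuples."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known link from each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def cut_mst(n, edges, threshold):
    """Keep MST edges with weight <= threshold (assumed rule); return one label per point."""
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]   # path halving
            x = root[x]
        return x

    for w, i, j in edges:
        if w <= threshold:
            root[find(i)] = find(j)
    return [find(i) for i in range(n)]


def validity_index(points, labels):
    """Assumed definition: mean squared intra-cluster distance / min squared centroid distance."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    if len(groups) < 2:
        return math.inf   # a single cluster has no inter-cluster distance
    centroids = {lab: tuple(mean(c) for c in zip(*pts)) for lab, pts in groups.items()}
    intra = mean(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    cs = list(centroids.values())
    inter = min(math.dist(a, b) ** 2 for i, a in enumerate(cs) for b in cs[i + 1:])
    return intra / inter


def best_threshold(points):
    """Score the three thresholds and return (name, validity index, labels) of the smallest."""
    edges = mst_edges(points)
    weights = [w for w, _, _ in edges]
    m, s = mean(weights), pstdev(weights)
    candidates = {"mean": m, "std": s, "mean+std": m + s}
    scored = {}
    for name, t in candidates.items():
        labels = cut_mst(len(points), edges, t)
        scored[name] = (validity_index(points, labels), labels)
    name = min(scored, key=lambda k: scored[k][0])
    return name, scored[name][0], scored[name][1]
```

Under these assumptions, a threshold small enough to reject every edge leaves each point in its own cluster and drives the index to zero, so a practical version would likely need to guard against that degenerate case.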
Keywords: Clustering Algorithm, MST, Validity Index, Threshold Values