A Comprehensive Study on the Importance of the Elbow and the Silhouette Metrics in Cluster Count Prediction for Partition Cluster Models
Abstract
Proper selection of the cluster count gives better clustering results in partition models. Partition
clustering methods are simple and efficient. K-means and its modified versions are efficient
cluster models, but their results are very sensitive to the chosen value of K. Partition
clustering algorithms are most suitable in applications where the data are arranged in a uniform
manner. This work evaluates the importance of the assigned cluster count for the efficiency of
partition clustering algorithms, using two well-known statistical methods, the Elbow method and
the Silhouette method. The performance of the Silhouette method and the Elbow method is compared
on different data sets from the UCI data repository. The values obtained using these methods are
compared with the cluster performance obtained on the selected data sets using the statistical
analysis tool Weka. Performance was evaluated in terms of cluster efficiency for small and large
data sets by varying the cluster count. Similar results were obtained from the three approaches:
the Elbow method, the Silhouette method and clustering with Weka. It was also observed that
clustering efficiency falls rapidly for small changes in cluster count when the cluster count is
small.
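As a rough illustration of the two criteria named in the abstract, the sketch below runs a plain K-means for several values of K and reports the within-cluster sum of squares (the quantity inspected in the Elbow method) and the mean Silhouette coefficient. It is not the authors' procedure: the synthetic three-blob data, the deterministic farthest-first seeding, and the second-difference rule used to locate the elbow are all assumptions made for this example, and every function name is hypothetical.

```python
import math
import random


def farthest_first_init(points, k):
    """Assumed seeding: start at points[0], then add the point farthest
    from all centroids chosen so far (deterministic, for reproducibility)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_first_init(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def wcss(points, centroids, labels):
    """Within-cluster sum of squared distances (the Elbow curve's y-value)."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def mean_silhouette(points, labels):
    """Mean of s = (b - a) / max(a, b) over all points; needs >= 2 clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def elbow_by_second_difference(wcss_by_k):
    """Assumed heuristic: the k with the largest discrete second difference
    of the WCSS curve. Expects consecutive integer values of k."""
    ks = sorted(wcss_by_k)
    return max(ks[1:-1],
               key=lambda k: wcss_by_k[k - 1] - 2 * wcss_by_k[k] + wcss_by_k[k + 1])


# Synthetic data (an assumption): three well-separated 2-D Gaussian blobs.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(30)]

wcss_by_k, silhouette_by_k = {}, {}
for k in range(1, 7):
    centroids, labels = kmeans(points, k)
    wcss_by_k[k] = wcss(points, centroids, labels)
    if k >= 2:  # the Silhouette coefficient is undefined for one cluster
        silhouette_by_k[k] = mean_silhouette(points, labels)

elbow_k = elbow_by_second_difference(wcss_by_k)
silhouette_k = max(silhouette_by_k, key=silhouette_by_k.get)
print("elbow suggests K =", elbow_k, "| silhouette suggests K =", silhouette_k)
```

On data this cleanly separated, both criteria should agree on the number of blobs. On real data sets the Elbow curve often has no sharp bend, which is one reason to read it alongside the Silhouette score rather than on its own.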
Paper Details
PaperID: 2408
Authors' Names: A.A. Abdulnassar and Latha R. Nair
Volume: Volume 11
Issues: Volume 11
Keywords: Cluster, Data Mining, Kmeans Partition Algorithm, Cluster Count Prediction, Elbow and Silhouette Method.
Year: 2021
Month: July
Pages: 3792-3806