Nepali Document Clustering using DBSCAN and OPTICS Algorithm

dc.contributor.author: Maharjan, Prabin
dc.date.accessioned: 2022-04-15T09:25:07Z
dc.date.available: 2022-04-15T09:25:07Z
dc.date.issued: 2018
dc.description.abstract: Automated document clustering is the process of grouping documents into a small set of meaningful collections based on the similarity between them. This research evaluates two density-based clustering algorithms, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points To Identify the Clustering Structure (OPTICS), on a Nepali dataset using four performance metrics: Homogeneity, Completeness, V-Measure, and Silhouette Coefficient. Feature extraction combines Term Frequency-Inverse Document Frequency (TF-IDF) with Latent Semantic Indexing (LSI). On these metrics, the clustering results of DBSCAN are slightly better than those of OPTICS. DBSCAN also requires less processing time. [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.14540/9845
dc.language.iso: en_US [en_US]
dc.publisher: Department of Computer Science & Information Technology [en_US]
dc.subject: Clustering [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Nepali document clustering [en_US]
dc.title: Nepali Document Clustering using DBSCAN and OPTICS Algorithm [en_US]
dc.type: Thesis [en_US]
local.academic.level: Masters [en_US]
local.institute.title: Central Department of Computer Science and Information Technology [en_US]
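
Three of the four metrics named in the abstract (Homogeneity, Completeness, and V-Measure) compare cluster assignments against known class labels using entropy. The sketch below shows the standard definitions in plain Python. It is not the thesis's code: the function names are hypothetical, the labels in the usage note are invented, and it assumes that a noise label such as -1 from DBSCAN or OPTICS is simply treated as one more cluster.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))   # counts for each (given, target) pair
    given_counts = Counter(given)         # counts for each value of `given`
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def homogeneity_completeness_v(classes, clusters):
    """Return (homogeneity, completeness, V-measure), weighting both equally."""
    h_classes = entropy(classes)
    h_clusters = entropy(clusters)
    # Homogeneity: each cluster should contain members of a single class.
    homogeneity = 1.0 if h_classes == 0 else (
        1.0 - conditional_entropy(classes, clusters) / h_classes)
    # Completeness: all members of a class should land in the same cluster.
    completeness = 1.0 if h_clusters == 0 else (
        1.0 - conditional_entropy(clusters, classes) / h_clusters)
    if homogeneity + completeness == 0:
        return 0.0, 0.0, 0.0
    # V-measure is the harmonic mean of homogeneity and completeness.
    v_measure = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v_measure
```

For example, with invented class labels `["sports", "sports", "politics", "politics"]`, the cluster assignment `[0, 0, 1, 1]` scores 1 on all three metrics, while placing every document in a single cluster keeps completeness at 1 but drops homogeneity and V-Measure to 0. The Silhouette Coefficient is not covered here because it depends on pairwise distances between document vectors rather than on labels.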
Files
Original bundle
Name:
Full Thesis.pdf
Size:
1.19 MB
Format:
Adobe Portable Document Format