Nepali Document Clustering using DBSCAN and OPTICS Algorithm

Maharjan, Prabin

Please use this identifier to cite or link to this item: https://elibrary.tucl.edu.np/handle/123456789/9845

Full metadata record

DC Field	Value	Language
dc.contributor.author	Maharjan, Prabin	-
dc.date.accessioned	2022-04-15T09:25:07Z	-
dc.date.available	2022-04-15T09:25:07Z	-
dc.date.issued	2018	-
dc.identifier.uri	https://elibrary.tucl.edu.np/handle/123456789/9845	-
dc.description.abstract	Automated document clustering is the process of grouping documents into a small set of meaningful collections based on the similarity between them. This research evaluates two density-based clustering algorithms, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points To Identify the Clustering Structure (OPTICS), on a Nepali dataset using four performance metrics: Homogeneity, Completeness, V-Measure and Silhouette Coefficient. Feature extraction is done using a combination of Term Frequency – Inverse Document Frequency (TF-IDF) and Latent Semantic Indexing (LSI). The results on these metrics show that the clustering produced by DBSCAN is slightly better than that of OPTICS. DBSCAN also requires less processing time.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Department of Computer Science & Information Technology	en_US
dc.subject	Clustering	en_US
dc.subject	Machine learning	en_US
dc.subject	Nepali document clustering	en_US
dc.title	Nepali Document Clustering using DBSCAN and OPTICS Algorithm	en_US
dc.type	Thesis	en_US
local.institute.title	Central Department of Computer Science and Information Technology	en_US
local.academic.level	Masters	en_US
Appears in Collections:	Computer Science & Information Technology
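
Three of the four metrics named in the abstract (Homogeneity, Completeness and V-Measure) compare cluster assignments against known class labels using conditional entropy. The sketch below shows one way these scores can be computed with the Python standard library. It is not the thesis's implementation: the function names, the topic labels and the toy cluster assignments are hypothetical and chosen only for illustration. It also assumes that DBSCAN or OPTICS noise points (commonly labelled -1) are simply treated as one additional cluster.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """Conditional entropy H(targets | given) in nats."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum(
        (c / n) * log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure_scores(true_classes, cluster_ids):
    """Return (homogeneity, completeness, v_measure) for one clustering.

    homogeneity  = 1 - H(class | cluster) / H(class)
    completeness = 1 - H(cluster | class) / H(cluster)
    v_measure    = harmonic mean of the two
    """
    h_class = entropy(true_classes)
    h_cluster = entropy(cluster_ids)
    # By convention a score is 1.0 when the corresponding entropy is zero.
    homogeneity = (
        1.0 if h_class == 0
        else 1.0 - conditional_entropy(true_classes, cluster_ids) / h_class
    )
    completeness = (
        1.0 if h_cluster == 0
        else 1.0 - conditional_entropy(cluster_ids, true_classes) / h_cluster
    )
    total = homogeneity + completeness
    v_measure = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v_measure


# Hypothetical toy data: four documents from two topics.
topics = ["sports", "sports", "politics", "politics"]
perfect = [0, 0, 1, 1]        # each cluster matches exactly one topic
fragmented = [0, 1, 2, 3]     # every document in its own cluster

scores_perfect = v_measure_scores(topics, perfect)
scores_fragmented = v_measure_scores(topics, fragmented)
print(scores_perfect)
print(scores_fragmented)
```

The fragmented assignment illustrates why the two scores are reported together: every cluster is pure, so homogeneity stays at its maximum, but each topic is split across clusters, which lowers completeness and therefore the V-Measure. The Silhouette Coefficient is not covered here because it is computed from distances between document vectors rather than from class labels.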

Files in This Item:

File	Description	Size	Format
Full Thesis.pdf		1.21 MB	Adobe PDF

