Ranking unstructured documents in IR (A comparative study of vector space model and latent semantic indexing model)

dc.contributor.author: Rana, Jamir
dc.date.accessioned: 2023-11-20T04:32:35Z
dc.date.available: 2023-11-20T04:32:35Z
dc.date.issued: 2012
dc.description.abstract: For thousands of years people have realized the importance of archiving and finding information. With the advent of computers, it became possible to store large amounts of information, and finding useful information in such collections became a necessity. The field of Information Retrieval (IR) was born in the 1950s out of this necessity. Over the last fifty years, the field has matured considerably, and several IR systems are used every day by a wide variety of users. The goal of IR is to provide users with the documents that satisfy their information need. Various models of information retrieval have been implemented, such as the Boolean Model, the Vector Space Model, and the Probabilistic Model. Among these, the Vector Space Model (VSM) and the Latent Semantic Indexing (LSI) Model are promising models still in use to date. The main concern of this study is to rank documents and to determine whether the LSI Model overcomes the problems VSM faces with synonymy and polysemy when ranking documents. The implemented features of these models are presented: the representation of documents and queries as vectors in R^|V|, the term-document matrix, term weighting, cosine similarity, singular value decomposition (SVD), and dimensionality reduction and its effect on LSI results. Precision and recall have been implemented to measure the effectiveness of the system. Conclusions are drawn and recommendations for future improvement are provided. [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.14540/20595
dc.language.iso: en_US [en_US]
dc.publisher: Department of Computer Science and Information Technology [en_US]
dc.subject: Unstructured documents [en_US]
dc.subject: Indexing model [en_US]
dc.title: Ranking unstructured documents in IR (A comparative study of vector space model and latent semantic indexing model) [en_US]
dc.type: Thesis [en_US]
local.academic.level: Masters [en_US]
local.institute.title: Central Department of Computer Science and Information Technology [en_US]
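
The abstract names term weighting, cosine similarity, and precision and recall as the building blocks of the VSM side of the comparison. The following is a minimal sketch of those steps, not the thesis's implementation: the three-document collection, the TF-IDF weighting variant, the relevance judgments, and all function names are assumptions made for illustration. It shows the synonymy problem the abstract refers to: a query for "car" scores zero against a document that only says "automobile". The LSI step (a truncated SVD of the term-document matrix) is not shown here.

```python
import math
from collections import Counter

# Hypothetical toy collection; not taken from the thesis.
docs = {
    "d1": "car engine repair",
    "d2": "automobile engine service",
    "d3": "banana bread recipe",
}

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def build_idf(docs):
    """Inverse document frequency: log(N / df) for each term."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    return {term: math.log(n / count) for term, count in df.items()}

def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms outside the vocabulary are dropped."""
    tf = Counter(tokenize(text))
    return {t: f * idf[t] for t, f in tf.items() if t in idf}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, docs):
    """Return (doc_id, score) pairs sorted by descending cosine score."""
    idf = build_idf(docs)
    q = tfidf_vector(query, idf)
    scores = {d: cosine(q, tfidf_vector(t, idf)) for d, t in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def precision_recall(retrieved, relevant):
    """Set-based precision and recall."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

ranking = rank("car", docs)
retrieved = [d for d, score in ranking if score > 0]
# Assumed relevance judgment: both d1 and d2 are about cars.
print(ranking)
print(precision_recall(retrieved, relevant={"d1", "d2"}))
```

With these assumed inputs, only d1 is retrieved, so recall suffers even though d2 is relevant. This is the kind of gap that a reduced-rank representation such as LSI is intended to narrow.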
Files
Original bundle
Name: Full thesis.pdf
Size: 1.03 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission