Automatic construction of dictionary on English-Nepali parallel corpus

dc.contributor.author: Saud, Lokendra Bahadur
dc.date.accessioned: 2023-05-22T05:59:02Z
dc.date.available: 2023-05-22T05:59:02Z
dc.date.issued: 2010
dc.description.abstract: This dissertation describes an approach based on word alignment of parallel corpora that aims to facilitate the lexicographic work of dictionary building. The proposed model first tokenizes the input text and then tags the tokenized text with the TnT tagger. Word alignment is then performed to find word pairs between the source (English) and target (Nepali) languages. Finally, our dictionary generation algorithm generates a sample dictionary from the words that make up the given input text. The model also relies on information from tagging; hence its accuracy depends not only on the alignment algorithm and the training corpus but also on the accuracy of the tagger. This corpus-driven technique, in particular the exploitation of parallel corpora, proved helpful in the creation of bilingual dictionaries for several reasons. Most importantly, a parallel corpus of appropriate size guarantees that the most relevant translations are included in the dictionary. [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.14540/17237
dc.language.iso: en_US [en_US]
dc.publisher: Department of Computer Science and Information Technology [en_US]
dc.subject: Automatic construction [en_US]
dc.subject: Dictionary [en_US]
dc.subject: English Nepali parallel [en_US]
dc.title: Automatic construction of dictionary on English-Nepali parallel corpus [en_US]
dc.type: Thesis [en_US]
local.academic.level: Masters [en_US]
local.institute.title: Central Department of Computer Science and Information Technology [en_US]
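
The pipeline summarized in the abstract (tokenize, align, generate dictionary entries) can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the record does not say which alignment method was used, so the sketch substitutes a simple Dice-coefficient score over sentence-level co-occurrence. The tagging step is omitted because the TnT tagger needs a trained model. The function names, the 0.8 threshold, and the romanized toy corpus are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

def tokenize(sentence):
    # Lowercased whitespace tokenization; a stand-in for a real tokenizer.
    return sentence.lower().split()

def dice_scores(pairs):
    """Score each (source, target) word pair by the Dice coefficient
    computed over sentence-level co-occurrence counts."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in pairs:
        src_words = set(tokenize(src_sent))
        tgt_words = set(tokenize(tgt_sent))
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint.update(product(src_words, tgt_words))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in joint.items()
    }

def build_dictionary(pairs, threshold=0.8):
    """For each source word, keep its best-scoring target word,
    provided the score reaches the (assumed) threshold."""
    best = {}
    for (s, t), score in dice_scores(pairs).items():
        if score >= threshold and score > best.get(s, ("", 0.0))[1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}

# Hypothetical toy corpus: English sentences with romanized Nepali glosses.
corpus = [
    ("i eat rice", "ma bhat khanchu"),
    ("i drink water", "ma pani piunchu"),
    ("i eat fruit", "ma phal khanchu"),
    ("i drink milk", "ma dudh piunchu"),
]

for english, nepali in sorted(build_dictionary(corpus).items()):
    print(english, "->", nepali)
```

On a corpus this small, pure co-occurrence only separates words that appear in different sentence combinations. Words that always occur together would tie, which is one reason a pipeline like the one in the abstract might also draw on part-of-speech tags.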

Files

Original bundle

Name: Cover(1).pdf
Size: 43.02 KB
Format: Adobe Portable Document Format

Name: Chapter(2).pdf
Size: 386.84 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission