A Computational Analysis of Nepali Morphology: A Model for Natural Language Processing
Date
2011
Publisher
Faculty of Linguistics
Abstract
The main goal of this study is to present a computational analysis of morphology in
Nepali for developing a model for natural language processing by applying the finite
state approach. The morphological categories have been analyzed according to the
principle of two-level morphology (Koskenniemi 1983), and these categories have
been implemented using the Xerox Finite State Tool (Beesley and Karttunen 2003) to
create the morphological analyzer. The study uses a finite state transducer, a
variant of the finite state automaton that encodes a relation between two languages,
the upper language and the lower language. The upper language corresponds to the
lexical level and the lower language to the surface level. The finite state transducer
is bidirectional: moving from the surface level to the lexical level is analysis, and
moving from the lexical level to the surface level is generation.
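The bidirectional behaviour described above can be illustrated with a minimal Python sketch. It is not the study's Xerox Finite State Tool implementation: the arc table, the tag names (`+N`, `+Sg`, `+Pl`) and the two-word lexicon are assumptions chosen only for illustration, and no phonological rules are modelled.

```python
# Toy finite state transducer. Each arc is (upper symbol, lower string, next state).
# The lexicon and the tags +N, +Sg, +Pl are illustrative assumptions,
# not the tag set or lexicon used in the study.
ARCS = {
    0: [("ghar", "ghar", 1), ("kitab", "kitab", 1)],  # noun stems
    1: [("+N", "", 2)],                               # category tag, empty on the surface
    2: [("+Sg", "", 3), ("+Pl", "haru", 3)],          # number: plural surfaces as -haru
}
FINAL = {3}


def _apply(text, match_side, emit_side):
    """Walk the arcs, consuming `text` on one side and emitting the other side."""
    results = []

    def walk(state, rest, out):
        if not rest and state in FINAL:
            results.append(out)
        for arc in ARCS.get(state, []):
            consumed, emitted, nxt = arc[match_side], arc[emit_side], arc[2]
            if rest.startswith(consumed):
                walk(nxt, rest[len(consumed):], out + emitted)

    walk(0, text, "")
    return results


def generate(lexical):
    """Lexical (upper) level to surface (lower) level."""
    return _apply(lexical, match_side=0, emit_side=1)


def analyze(surface):
    """Surface (lower) level to lexical (upper) level."""
    return _apply(surface, match_side=1, emit_side=0)


if __name__ == "__main__":
    print(generate("ghar+N+Pl"))
    print(analyze("gharharu"))
```

The same arc table serves both directions; only the side that is matched against the input changes, which is the sense in which the transducer is bidirectional.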
This study is organized into eight chapters. Chapter 1 presents the general
morphological concepts, the objectives, methodology, the significance and limitations
of the study. Chapter 2 presents the theoretical framework that is adopted for the
study. Chapter 3 analyzes nouns, pronouns, adjectives, numerals and classifiers in
Nepali. Chapter 4 analyzes Nepali verbs from a computational perspective in the
first part and verbal inflections in the second part. Chapter 5 deals with indeclinable
words in Nepali. Chapter 6 analyzes the derivational process. Chapter 7 implements
the analyses of the previous chapters as a finite state transducer using the Xerox
Finite State Tool. Chapter 8 summarizes the findings of the study.
This study has identified fourteen groups of nouns, eight groups of pronouns, four
groups of adjectives, one group of cardinal numerals, two groups of ordinal numerals,
three groups of classifiers, ten groups of verbs, seven groups of adverbs, two groups
of conjunctions, three groups of postpositions, one group of particles and fifteen
groups of derivations in Nepali. The phonological rules for each group have also been
identified. A finite state transducer with the corresponding morphological tags and
phonological rules has been created for each group, and all of these transducers have
been combined into a single transducer that can be used as a morphological
analyzer for Nepali.
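The idea of combining per-group transducers into a single analyzer can be sketched in Python by treating each group as a finite relation between lexical and surface strings and taking their union. This is only a sketch under stated assumptions: the two groups, their entries and the tags (`+N`, `+Sg`, `+Pl`, `+CONJ`) are hypothetical, and the study's actual transducers and phonological rules are not reproduced here.

```python
# Each group is modelled as a finite relation {lexical form: surface form}.
# Entries and tags are illustrative assumptions, not the study's data.
NOUNS = {"ghar+N+Sg": "ghar", "ghar+N+Pl": "gharharu"}
CONJUNCTIONS = {"ra+CONJ": "ra"}


def union(*groups):
    """Combine several group relations into one set of (lexical, surface) pairs."""
    combined = set()
    for group in groups:
        combined |= set(group.items())
    return combined


def analyze(relation, surface):
    """Return every lexical form paired with the given surface form."""
    return sorted(lex for lex, surf in relation if surf == surface)


def generate(relation, lexical):
    """Return every surface form paired with the given lexical form."""
    return sorted(surf for lex, surf in relation if lex == lexical)


# A single relation that covers all groups, usable in both directions.
ANALYZER = union(NOUNS, CONJUNCTIONS)

if __name__ == "__main__":
    print(analyze(ANALYZER, "gharharu"))
    print(analyze(ANALYZER, "ra"))
    print(generate(ANALYZER, "ghar+N+Sg"))
```

A real finite state toolkit performs the union on the automata themselves rather than on enumerated pairs, which is what allows open-ended lexicons to be handled compactly.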
Keywords
Computational analysis, Language processing, Nepali morphology