Performance analysis of Naive Bayes and support vector machine algorithm on classification of Nepali opinion text
Publisher
Department of Computer Science and Information Technology
Abstract
An opinion is an individual's subjective expression about something: a view, an emotion,
or a sentiment. Opinions help individuals and organizations make decisions about
particular things. Opinion classification is the process of analyzing views or opinions
using natural language processing techniques. Naïve Bayes and Support Vector
Machine (SVM) are supervised machine learning algorithms for classification.
Most research on opinion classification has been done for the English language, but it is
important to perform opinion classification for Nepali as well, because the amount of Nepali
text in the form of blogs, reviews, and newspaper opinion columns is increasing rapidly.
In this study, Nepali sentences were collected from the opinion sections of various online
portals of national newspapers. Both algorithms were implemented in the Python
programming language with the NLTK library, and their outputs were analyzed on the basis
of performance metrics. On preprocessed data, SVM achieved an accuracy of 85%, higher
than the 83% achieved by Naïve Bayes. The accuracy of both algorithms improved after
preprocessing compared with the data without preprocessing. The study concluded that
SVM was the better model, with higher values of the performance metrics, and recommends
it over the Naïve Bayes algorithm for opinion classification of Nepali text data.
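
The abstract compares the two classifiers by accuracy and other performance metrics but does not list those metrics or the label set. As a minimal sketch, assuming a binary task with hypothetical labels "positive" and "negative" and assuming the metrics are accuracy, precision, recall, and F1, they could be computed from true and predicted labels as shown here. The function name `binary_metrics` and the toy labels are illustrative only and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall and F1 for a binary task.

    `positive` names the class treated as the positive one; the label
    strings used here are assumptions, not the study's actual labels.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (not data from the study).
toy_true = ["positive", "positive", "negative", "negative"]
toy_pred = ["positive", "negative", "negative", "positive"]
toy_scores = binary_metrics(toy_true, toy_pred)
```

Running the same function on the predictions of each classifier over a shared held-out test set would give directly comparable figures of the kind the abstract reports.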
