HETEROGENEOUS GRAPH ATTENTION NETWORK FOR SEMISUPERVISED NEWS CLASSIFICATION
Publisher
Pulchowk Campus
Abstract
Text classification is an important task in Natural Language Processing. It involves
understanding the semantics of text and assigning it to the proper class, and its output can be
used in many other Natural Language Processing tasks. A huge amount of data is generated every
day, and much of it is text, but labeled text data is difficult to find, so understanding and
analyzing the semantics of such data has become challenging. Recent work on graph networks maps
raw data to meaningful representations, which is a great advantage when little labeled data is
available. Graph networks use a small amount of labeled data together with unlabeled data and
have performed very well in such settings. This work explores that technique and tries to
improve on current work in short text classification. Representing the raw data as a
heterogeneous graph adds more semantics to the network, since most real-world data is
heterogeneous. In this work, the raw news data is converted into 3 types of nodes and the
connections (edges) between them, which results in a heterogeneous graph. A heterogeneous
neural network is then applied to embed the graph in a lower dimension. A dual-level attention
network is also applied, which gives more attention to the more important nodes and edges and
further increases the performance of the model. Using word embeddings from a pretrained model
simplifies the network and improves both its efficiency and its performance. The model
outperforms the previous model in classifying short news data. On the AgNews dataset the
accuracy is 76.3%, and on the TagMyNews dataset the accuracy is 59.7%, which is higher than the
previously applied model by more than 4% and 3%, respectively. Other visual and comprehensive
evaluations also show that the model performs well with a small amount of data.
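
The idea of dual-level attention over a heterogeneous graph can be illustrated with a small sketch. It is not the thesis implementation: the three node type names (document, topic, entity), the toy edges and features, and the function names are all hypothetical, and plain dot products stand in for the learned attention parameters a trained model would use. The sketch first weights each neighbor type, then weights individual neighbors, and finally aggregates their features.

```python
import math

# Hypothetical toy graph. The abstract only states that there are 3 node types;
# the names "document", "topic" and "entity" are assumptions for illustration.
NODE_TYPES = {"d0": "document", "d1": "document", "t0": "topic", "e0": "entity"}
EDGES = [("d0", "t0"), ("d0", "e0"), ("d1", "t0"), ("d0", "d1")]
FEATURES = {"d0": [1.0, 0.0], "d1": [0.5, 0.5], "t0": [0.0, 1.0], "e0": [1.0, 1.0]}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def neighbors(node):
    """Neighbors of `node` in the undirected edge list."""
    found = []
    for a, b in EDGES:
        if a == node:
            found.append(b)
        elif b == node:
            found.append(a)
    return found


def dual_level_attention(node):
    """Return (aggregated embedding, type weights, node weights) for `node`.

    Assumption: unparameterized dot-product scores replace learned attention.
    """
    h = FEATURES[node]
    nbrs = neighbors(node)

    # Group neighbors by their node type.
    by_type = {}
    for n in nbrs:
        by_type.setdefault(NODE_TYPES[n], []).append(n)

    # Type-level attention: one weight per neighbor type.
    types = sorted(by_type)
    type_scores = [dot(h, mean_vector([FEATURES[n] for n in by_type[t]])) for t in types]
    type_w = dict(zip(types, softmax(type_scores)))

    # Node-level attention: each neighbor's score is scaled by its type weight.
    node_scores = [type_w[NODE_TYPES[n]] * dot(h, FEATURES[n]) for n in nbrs]
    node_w = softmax(node_scores)

    # Weighted sum of neighbor features gives the new embedding.
    embedding = [sum(w * FEATURES[n][k] for w, n in zip(node_w, nbrs)) for k in range(len(h))]
    return embedding, type_w, dict(zip(nbrs, node_w))


embedding, type_weights, node_weights = dual_level_attention("d0")
print(type_weights)
print(node_weights)
print(embedding)
```

In this sketch both sets of weights sum to 1, so the result is a convex combination of neighbor features in which neighbors of a highly weighted type contribute more.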
Citation
MASTER OF SCIENCE IN COMPUTER SYSTEM AND KNOWLEDGE ENGINEERING
