HETEROGENEOUS GRAPH ATTENTION NETWORK FOR SEMISUPERVISED NEWS CLASSIFICATION

Devkota, Sujil

Please use this identifier to cite or link to this item: https://elibrary.tucl.edu.np/handle/123456789/7673

Title:	HETEROGENEOUS GRAPH ATTENTION NETWORK FOR SEMISUPERVISED NEWS CLASSIFICATION
Authors:	Devkota, Sujil
Keywords:	Text Classification; Heterogeneous Graph Network; Attention Network; Word Embedding
Issue Date:	Aug-2021
Publisher:	Pulchowk Campus
Institute Name:	Institute of Engineering
Level:	Masters
Citation:	MASTER OF SCIENCE IN COMPUTER SYSTEM AND KNOWLEDGE ENGINEERING
Abstract:	Text classification is one of the important tasks in Natural Language Processing. It involves understanding the semantics of text and assigning it to the proper class, and it serves as a building block for many other Natural Language Processing tasks. A huge amount of data is generated every day, and text is one of its major forms, but labeled text data is difficult to find, so understanding and analyzing the semantics of such data has become challenging. Recent work on graph networks maps raw data to meaningful representations, which is a great advantage when little labeled data is available: graph networks use a small amount of labeled data together with unlabeled data and have performed very well in such settings. This work explores that technique and tries to enhance current work on short text classification. Representing the raw data as a heterogeneous graph adds more semantics to the network, since most real-world data is heterogeneous. In this work, the raw news data is converted into three types of nodes and the connections (edges) between them, which yields the heterogeneous graph. A heterogeneous neural network is then applied to embed the graph in a lower dimension. In addition, a dual-level attention network gives more attention to the more important nodes and edges, further increasing the performance of the model. Using word embeddings from a pretrained model simplifies the network and improves both its efficiency and its performance. The model outperforms the previous model in classifying short news data: accuracy is 76.3% on the AgNews dataset and 59.7% on the TagMyNews dataset, which is higher than the previously applied model by more than 4% and 3% respectively. Other visual and comprehensive evaluations also show that the model performs well with a small amount of data.
Description:	Text classification is one of the important tasks in Natural Language Processing. It involves understanding the semantics of text and assigning it to the proper class, and it serves as a building block for many other Natural Language Processing tasks.
URI:	https://elibrary.tucl.edu.np/handle/123456789/7673
Appears in Collections:	Electronics and Computer Engineering

Files in This Item:

File	Description	Size	Format
thesis-final report.pdf		1.82 MB	Adobe PDF
