Comparative Analysis of ResNet50 and Vision Transformer on Paddy Disease Classification

Bhattarai, Bhawana

Date

2025

Abstract

Plant diseases seriously threaten the food supply, affecting farmers, communities that depend on farming, and global food security. Early detection of plant disease is critical for effective treatment and for minimizing yield losses. One useful application of computer vision is identifying plant diseases by analyzing leaf images. This study compares the ResNet50 model and the Vision Transformer (ViT) model on the paddy disease classification task. Specifically, it trains both models on the Paddy Doctor dataset, which contains 16,225 images categorized into 13 classes, and evaluates their performance while varying the learning rate and the number of training epochs. The ResNet50 model achieved a training accuracy of 0.98, which dropped to 0.92 on the test dataset. The ViT model achieved a training accuracy of 0.99 and a test accuracy of 0.93. These results indicate that the ViT model slightly outperforms the ResNet50 model in both training and test accuracy for paddy disease classification on the Paddy Doctor dataset.

Keywords: Classification, Deep Learning, CNN, ResNet50, Disease, Paddy, Vision Transformer