Comparing Vision Transformers and CNNs for Accurate Retinal Disease Classification
Date
2025
Abstract
Retinal diseases such as Age-Related Macular Degeneration (AMD) and Diabetic Macular Edema (DME) contribute significantly to vision impairment worldwide. Early diagnosis and timely treatment can prevent blindness in many patients. This research uses Optical Coherence Tomography (OCT) images to classify retinal diseases with deep learning models. Specifically, we compare Vision Transformers (ViTs), Convolutional Neural Networks (CNNs), and a proposed hybrid CNN-Transformer model (HybridCNNViT), which combines the local feature extraction strengths of CNNs with the global context modeling capabilities of Transformers. Comparative evaluations of accuracy, precision, and computational efficiency show that HybridCNNViT outperforms standalone ViTs and CNNs for retinal disease classification. The model offers a promising approach to improving healthcare outcomes in ophthalmology and can be further refined for use in automated retinal disease detection and clinical diagnostics.
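The abstract compares models on accuracy and precision without stating how these were computed. As a minimal sketch, assuming a multi-class setting and macro-averaged precision (the averaging scheme is an assumption, not something the record specifies), the two metrics could be derived from true and predicted labels as follows. The function names, the `NORMAL` class, and the label lists are hypothetical and serve only as illustration; they are not data or results from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean over classes of TP / (TP + FP).

    A class that is never predicted contributes a precision of 0.0.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    classes = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)  # TP + FP per class
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    return sum(per_class) / len(per_class)


# Hypothetical labels for illustration only; not results from the study.
y_true = ["AMD", "DME", "NORMAL", "AMD", "DME", "NORMAL"]
y_pred = ["AMD", "DME", "NORMAL", "DME", "DME", "NORMAL"]

print(f"accuracy:        {accuracy(y_true, y_pred):.3f}")
print(f"macro precision: {macro_precision(y_true, y_pred):.3f}")
```

Macro averaging weights every disease class equally, which matters when some classes are rarer than others in an OCT dataset; a weighted or micro average would be an equally valid choice and would give different numbers.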
Keywords
Vision Transformers, Convolutional Neural Networks, HybridCNNViT, Retinal Disease Classification, Optical Coherence Tomography, Deep Learning in Healthcare, Medical Image Analysis