Please use this identifier to cite or link to this item:
https://elibrary.tucl.edu.np/handle/123456789/18840
Title: | NEURAL AUDIO CODEC |
Authors: | BARAL, SUBODH; PANDEY, TAPENDRA; BURLAKOTI, ACHYUT; BARAL, SIJAL |
Keywords: | Audio Compression; Deep Learning; Audio Codec |
Issue Date: | 30-Apr-2023 |
Publisher: | I.O.E. Pulchowk Campus |
Institute Name: | Institute of Engineering |
Level: | Bachelor |
Abstract: | End-to-end neural audio codecs have gained popularity because they learn efficient audio representations directly from data, without relying on handcrafted signal-processing components. This paper evaluates a neural audio codec against the traditional audio codecs Opus and EVS in terms of audio quality and efficiency. The study argues that existing audio codecs make little use of the abundant data available to the audio compression pipeline and proposes deep learning-based models as a potential solution. It reviews recent advances in deep learning-based audio synthesis and representation learning and explores how deep learning-based codecs can improve compression efficiency. It also addresses limitations of existing neural models, including slower training times and increased memory requirements. Experimental results show that our approach performs comparably to the widely used Opus codec at low bitrates and slightly below current deep learning-based frameworks, in exchange for significant improvements in speed and memory requirements. We have released our code and pre-trained models at https://github.com/AchyutBurlakoti/Neural-Audio-Compression for further research and improvement. |
URI: | https://elibrary.tucl.edu.np/handle/123456789/18840 |
Appears in Collections: | Computer Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Subodh Baral et al. be report computer apr2023.pdf | | 1.22 MB | Adobe PDF | View/Open |