Browsing by Issue Date, starting with "2023-04-30"
Item AUGMENTING SELF-LEARNING AGENT IN FIRST-PERSON SHOOTER GAME USING REINFORCEMENT LEARNING (I.O.E. Pulchowk Campus, 2023-04-30) SINGH, SAMRAT; NEUPANE, SKEIN; PANDEY, SUSHANT; JOSHI, YACHU RAJA
This group project highlights the effectiveness of using reinforcement learning (RL) with the Proximal Policy Optimization (PPO) algorithm to train an agent to play a Wolfenstein 3D-like game with multiple levels. The agent performed strongly in terms of reward, time efficiency, and overall effectiveness. An in-depth analysis of its performance indicated marked improvements in the reward curves, strategic navigation throughout the game levels, and fast completion of each level. The study highlights the potential of RL and PPO for training agents in complex video games with multiple levels, as well as in other applications such as agent-based modeling and machine learning.

Item “INFORMATION EXTRACTION FROM UNSTRUCTURED DATA” (I.O.E. Pulchowk Campus, 2023-04-30) LAMMICHHANE, AAYUSH; NEUPANE, AAYUSH; PAUDEL, ANKIT; LAMSAL, ASHISH
In today’s digital age, the digitization of paper documents such as invoices and receipts has become increasingly important. However, manually entering data from these documents is time-consuming and error-prone, which causes inefficiencies and drives up expenses for enterprises. To solve this issue, we created a software platform that automates the extraction of important data from scanned documents using deep learning, specifically the LayoutLM architecture. Users can upload their scanned documents in bulk to our platform and choose which fields they want to extract, including date, merchant name, and total amount. The platform is scalable and can handle high document volumes while preserving accuracy and efficiency. The user-friendly interface makes it easy for users to upload their scanned documents and extract information from them.
Our platform offers significant benefits, including increased efficiency, accuracy, and cost savings, and has the potential to transform the way businesses handle physical documents. In this project, we provide an overview of our software platform, including the technology behind it, its key features, and its potential applications.

Item KEYPHRASE DETECTION AND QUESTION GENERATION FROM TEXT USING MACHINE LEARNING (I.O.E. Pulchowk Campus, 2023-04-30) LAMICHHANE, AAYUSH
Question generation may not be as prominent as question answering, but it remains a relevant task in NLP. The ability to ask meaningful questions provides evidence of comprehension in an artificial intelligence (AI) model. This makes the task of question generation important in the bigger picture of AI. While existing question generation techniques rely on complex model architectures and additional mechanisms to boost performance, we show that transformer-based fine-tuning techniques can create robust question generation systems using only a single language model, without additional mechanisms, answer metadata, or extensive features. Training parameters for our project include: epochs: 10, batch size: 4, learning rate: 10e-3. Lastly, we look into the model’s failure modes and identify possible reasons why the model fails.

Item INFORMATION EXTRACTION FROM STRUCTURED DOCUMENT (I.O.E. Pulchowk Campus, 2023-04-30) KANU, AAYUSH SHAH; POKHREL, ADITHYA; BASHYAL, BISHAL; SHARMA, JANAK
This project proposes the use of LayoutLMv2, a deep learning model, for information extraction from form-like documents. Specifically, the IRS 990 tax form was used as the dataset for testing and optimization. Extracting information from form-like documents can be challenging because identifying fields and their corresponding values requires complex layout analysis and text recognition.
The proposed model, LayoutLMv2, has demonstrated its effectiveness in these tasks, making it a promising solution for information extraction from form-like documents. The project resulted in the development of a web application and annotation tools that give users a user-friendly interface to upload documents and extract relevant information accurately and efficiently. The annotation tool enables users to label data and train custom models, while the web application streamlines document processing for businesses and organizations.

Item LIVE CAMERA FEED SCENE DESCRIPTOR FOR VISUALLY IMPAIRED (I.O.E. Pulchowk Campus, 2023-04-30) SUBEDI, AADITYA MANI et al.
Visually impaired people want to interact with nature, their surroundings, and other people. They want to experience the events happening around them but are deprived of that experience. Our product aims to assist visually impaired individuals in navigating their way around Pulchowk Campus and to describe the actions happening inside the campus. The product uses a live camera feed and visual transformer techniques to generate a descriptive caption and its audio output, providing the user with an accurate and timely description of their surroundings. The product is designed to work on several campus landmarks and a wide range of activities. We propose a model obtained by fine-tuning the pre-trained Git-base-Vatex model on our campus video datasets to describe the surrounding scene.

Item PARAPHRASE GENERATION OF NEPALI LANGUAGE IN DEVANAGARI SCRIPT USING NATURAL LANGUAGE PROCESSING (I.O.E. Pulchowk Campus, 2023-04-30) SAPKOTA, AAJAY; ACHARYA, ABINASH; BADE, ANISH; UPRETI, MAHESH
The project aims to develop a system for generating paraphrases using transformer-based models. The pre-trained models were fine-tuned on a large-scale dataset of sentence pairs, consisting of source sentences and their corresponding paraphrases, and their performance was evaluated on several benchmarks.
To accomplish the project’s objectives, several tasks were undertaken, such as researching and allocating resources, collecting and translating datasets, sampling, filtering, and analyzing the feasibility of the model. The comprehensive approach employed in the project has enabled the development of a powerful tool for generating high-quality paraphrases, which could enhance the natural language processing and generation capabilities of various applications. The model was assessed with mathematical and statistical metrics such as BLEU and ROUGE scores, on which it performed well. Additionally, the model demonstrated excellent performance on different datasets, showcasing its ability to generalize across different types of test sets. However, the zero-shot evaluation produced weaker results than expected, suggesting a low recall score for new sentences and highlighting the need for further improvements in the model. The model also faces significant challenges such as entity mismatches, semantic and syntactic differences, and exact match problems between the input sentences and their corresponding generated sentences. Furthermore, the implementation of a web application enabled users to input sentences and receive their paraphrases in real time, demonstrating the practicality of our approach. Nonetheless, this research emphasizes the vast potential of advanced language models to enhance natural language processing capabilities in low-resource languages.

Item NEURAL AUDIO CODEC (I.O.E. Pulchowk Campus, 2023-04-30) BARAL, SUBODH; PANDEY, TAPENDRA; BURLAKOTI, ACHYUT; BARAL, SIJAL
Neural audio codecs that use end-to-end approaches have gained popularity due to their ability to learn efficient audio representations through data-driven methods, without relying on handcrafted signal processing components.
This research paper evaluates the performance of a neural audio codec in comparison to the traditional audio codecs Opus and EVS in terms of audio quality and efficiency. The study highlights the limitations of existing audio codecs in leveraging the abundant data available in the audio compression pipeline and proposes deep learning-based models as a potential solution. The paper reviews recent advancements in deep learning-based audio synthesis and representation learning and explores the potential of deep learning-based audio codecs in enhancing compression efficiency. The study also addresses the limitations of existing models, including slower training times and increased memory requirements, by releasing open-source code and pre-trained models for further research and improvement. Experimental results show that our approach has comparable performance to the widely used commercial codec Opus at low bitrate, and a slight drop in performance compared to current deep learning-based frameworks in exchange for a significant improvement in speed and memory requirements. We have released our code and pre-trained models at https://github.com/AchyutBurlakoti/Neural-Audio-Compression for further research and improvement.

Item GUIDANCE, NAVIGATION AND CONTROL OF A VTOL VEHICLE TO MAKE IT FOLLOW A PREDETERMINED TRAJECTORY (I.O.E. Pulchowk Campus, 2023-04-30) PATHAK, SAKAR; ACHARYA, SAMUNDRA; SILWAL, SHREEJAN SINGH; DHAKAL, SWAYAM
In interplanetary missions, the landing of space vehicles is typically accomplished using parachutes. However, this simple method is not without its challenges, as these vehicles are prone to parachute drifts that are difficult to predict, especially on planets with dense atmospheres like Earth.
As a result, significant attention has recently been given to the development of active control systems for space vehicles, allowing for precise guidance, navigation, and control over predetermined trajectories and enabling soft and accurate landings on planetary surfaces. The ability to follow a predetermined path and land softly and precisely using real-time onboard control algorithms would greatly enhance the capabilities of vehicles for interplanetary travel, while also increasing the reusability of space vehicles. This not only benefits interplanetary travel but also improves space payload delivery systems by reducing costs and increasing efficiency. To this end, this project aims to implement control algorithms on an Electric Ducted Fan (EDF) powered model of a Vertical Take-Off and Landing (VTOL) vehicle, enabling it to follow a fixed trajectory. A small CanSat payload will be attached to the vehicle and deployed at a specific altitude, simulating the tasks required of a full-scale vehicle. By utilizing these advanced control systems, space vehicles can navigate more accurately and efficiently, reducing the risks and costs associated with interplanetary travel. With a focus on trajectory control and precision landing, this project aims to contribute to the ongoing efforts to enhance space exploration and technology development.

Item QUESTION SIMILARITY DETECTION AND ANALYSIS (I.O.E. Pulchowk Campus, 2023-04-30) SHRESTHA, MILAN; SHAKYA, NISCHAL; SWARNAKAR, NITESH; SUBEDI, ROSHAN
The project aims to explore the effectiveness of using the SBERT model and a vector database for performing question similarity analysis. The project involves building a vector database by training a sentence transformer model on a large corpus of text data. The vector database is then used to perform question similarity analysis by retrieving questions similar to a given search query, along with their similarity scores.
The model is trained on the ALLNLI datasets, paraphrase datasets such as MRPC and PAWS, and semantic similarity datasets such as STS, and is finally adapted on 9,282 custom-prepared engineering datasets. The sentence transformer model is trained on these datasets with MNR loss as the loss function. The effectiveness of the model is evaluated using the STS test dataset and the MRPC test set. The results demonstrate that using a sentence transformer model and a vector database for question similarity analysis outperforms the baseline method of keyword matching. The approach achieved a Spearman correlation of 0.863 on the STS benchmark and an accuracy of 88.7% on the MRPC test set. The Spearman correlation reported in the SBERT paper for the NLI-large dataset was below 0.80. These values suggest that continued training of the model on datasets beyond NLI improves performance on downstream tasks. This suggests that the use of a sentence transformer model and a vector database is a promising approach for performing question similarity analysis, which could have significant implications for information retrieval systems.
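
The retrieval and evaluation steps mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine`, `top_k`, `ranks`, `spearman`) and the toy index are hypothetical, and the hand-made 3-dimensional vectors merely stand in for SBERT embeddings. The sketch assumes that similar questions are retrieved by cosine similarity and that STS-style evaluation compares predicted similarities with reference scores through Spearman rank correlation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(query_vec, index, k=2):
    """Return the k (score, question) pairs most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def ranks(values):
    """1-based ranks of the values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy index: placeholder vectors, NOT real SBERT embeddings.
TOY_INDEX = [
    ("What is Ohm's law?", [0.9, 0.1, 0.0]),
    ("State Ohm's law.", [0.8, 0.2, 0.1]),
    ("Define a binary tree.", [0.0, 0.1, 0.9]),
]

if __name__ == "__main__":
    # Retrieve the two stored questions closest to a placeholder query vector.
    for score, question in top_k([1.0, 0.0, 0.0], TOY_INDEX, k=2):
        print(f"{score:.3f}  {question}")
```

In a real system the placeholder vectors would come from the trained sentence transformer, and a vector database would replace the linear scan in `top_k`.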