Please use this identifier to cite or link to this item:
https://elibrary.tucl.edu.np/handle/123456789/7649
Title: | Parallelization of Star Alignment Algorithm for Multiple Sequence Alignment using MapReduce Model |
Authors: | Ansari, Md Hasan |
Keywords: | Bioinformatics, Multiple Sequence Alignment, Needleman-wunch,;Star Alignment, Parallelization, Hadoop, MapReduce. |
Issue Date: | Apr-2017 |
Publisher: | Pulchowk Campus |
Institute Name: | Institute of Engineering |
Level: | Masters |
Citation: | MASTER OF SCIENCE IN COMPUTER SYSTEM AND KNOWLEDGE ENGINEERING |
Abstract: | Multiple sequence alignment (MSA) is an important problem in molecular biology. Biological sequences are aligned with each other vertically to show possible similarities or differences among these sequences. To solve an MSA problem is to find an alignment of multiple sequences with the highest score based on a given scoring criterion among sequences. Dynamic programming algorithms like Needleman-Wunch and Smith-Waterman produce accurate alignments but these algorithms are computation intensive, computational complexity of O(n2) and are limited to a small number of short sequences. Similarly multiple sequence alignment that processes the sequences one by one, called star alignment, takes time until O(k2n2). However the computation result still has high accuracy. Consequently, it is very important to get a better way to improve the performance. To achieve this, a MapReduce model of star alignment is designed and implemented that executes in parallel on a hadoop clusters. Since hadoop already handles work/job dispatching and work balance among distributed worker nodes, we need note handle node failure and load balancing required with the traditional distributed computing. The experimental result shows that the MapReduce model of star alignment improve the execution time by 3 times with 8 physical nodes than single node with datasets size greater than 1 GB. |
Description: | Multiple sequence alignment (MSA) is an important problem in molecular biology. |
URI: | https://elibrary.tucl.edu.np/handle/123456789/7649 |
Appears in Collections: | Electronics and Computer Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Thesis_071MSCS655.pdf | 1.48 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.