Please use this identifier to cite or link to this item: https://elibrary.tucl.edu.np/handle/123456789/7649
Title: Parallelization of Star Alignment Algorithm for Multiple Sequence Alignment using MapReduce Model
Authors: Ansari, Md Hasan
Keywords: Bioinformatics, Multiple Sequence Alignment, Needleman-Wunsch, Star Alignment, Parallelization, Hadoop, MapReduce
Issue Date: Apr-2017
Publisher: Pulchowk Campus
Institute Name: Institute of Engineering
Level: Masters
Citation: MASTER OF SCIENCE IN COMPUTER SYSTEM AND KNOWLEDGE ENGINEERING
Abstract: Multiple sequence alignment (MSA) is an important problem in molecular biology. Biological sequences are aligned with each other vertically to show possible similarities or differences among them. Solving an MSA problem means finding the alignment of multiple sequences that has the highest score under a given scoring criterion. Dynamic programming algorithms such as Needleman-Wunsch and Smith-Waterman produce accurate alignments, but they are computationally intensive, with a complexity of O(n²), and are limited to a small number of short sequences. Similarly, star alignment, a multiple sequence alignment method that processes the sequences one by one, takes up to O(k²n²) time, although its results still have high accuracy. It is therefore important to find a way to improve its performance. To achieve this, a MapReduce model of star alignment is designed and implemented that executes in parallel on a Hadoop cluster. Since Hadoop already handles job dispatching and load balancing among distributed worker nodes, we need not handle the node failures and load balancing that traditional distributed computing requires. The experimental results show that, for datasets larger than 1 GB, the MapReduce model of star alignment improves execution time by a factor of 3 on 8 physical nodes compared with a single node.
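The O(n²) and O(k²n²) costs mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names `nw_score` and `star_center` and the scoring values (match 1, mismatch -1, gap -1) are assumptions chosen for illustration. The sketch computes a Needleman-Wunsch global alignment score for one pair, then picks a star center by summing the scores over all pairs. Each pairwise score is independent of the others, which is one plausible reason the problem suits a map step; the thesis's actual MapReduce design may differ.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score.

    Fills the dynamic programming table row by row, so time is
    O(len(a) * len(b)). The scoring values are illustrative assumptions.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def star_center(seqs):
    """Index of the sequence with the highest total pairwise score.

    Scores every unordered pair once: k(k-1)/2 alignments of O(n^2)
    each, which gives the O(k^2 n^2) cost of the center-selection step.
    """
    totals = [0] * len(seqs)
    for i, j in combinations(range(len(seqs)), 2):
        score = nw_score(seqs[i], seqs[j])  # independent per pair
        totals[i] += score
        totals[j] += score
    return max(range(len(seqs)), key=totals.__getitem__)
```

In a full star alignment, the remaining sequences would then be aligned to the chosen center and merged into one multiple alignment; that step is omitted here.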
URI: https://elibrary.tucl.edu.np/handle/123456789/7649
Appears in Collections: Electronics and Computer Engineering

Files in This Item:
File: Thesis_071MSCS655.pdf
Size: 1.48 MB
Format: Adobe PDF

