Parallelization of Star Alignment Algorithm for Multiple Sequence Alignment using MapReduce Model

Ansari, Md Hasan

Parallelization of Star Alignment Algorithm for Multiple Sequence Alignment using MapReduce Model

Files

Thesis_071MSCS655.pdf (1.45 MB)

Date

2017-04

Authors

Ansari, Md Hasan

Publisher

Pulchowk Campus

Abstract

Multiple sequence alignment (MSA) is an important problem in molecular biology. Biological sequences are aligned with each other vertically to show possible similarities or differences among these sequences. To solve an MSA problem is to find an alignment of multiple sequences with the highest score based on a given scoring criterion among sequences. Dynamic programming algorithms like Needleman-Wunch and Smith-Waterman produce accurate alignments but these algorithms are computation intensive, computational complexity of O(n2) and are limited to a small number of short sequences. Similarly multiple sequence alignment that processes the sequences one by one, called star alignment, takes time until O(k2n2). However the computation result still has high accuracy. Consequently, it is very important to get a better way to improve the performance. To achieve this, a MapReduce model of star alignment is designed and implemented that executes in parallel on a hadoop clusters. Since hadoop already handles work/job dispatching and work balance among distributed worker nodes, we need note handle node failure and load balancing required with the traditional distributed computing. The experimental result shows that the MapReduce model of star alignment improve the execution time by 3 times with 8 physical nodes than single node with datasets size greater than 1 GB.

Description

Multiple sequence alignment (MSA) is an important problem in molecular biology.

Keywords

Bioinformatics, Multiple Sequence Alignment, Needleman-wunch,, Star Alignment, Parallelization, Hadoop, MapReduce.

Citation

MASTER OF SCIENCE IN COMPUTER SYSTEM AND KNOWLEDGE ENGINEERING

URI

https://hdl.handle.net/20.500.14540/7649

Collections

Electronics and Computer Engineering

Full item page

Parallelization of Star Alignment Algorithm for Multiple Sequence Alignment using MapReduce Model

Files

Date

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Description

Keywords

Citation

URI

Collections