Honors Thesis

A Parallel Algorithm for Approximating the Optimal Alignment of Two Genome-Scale DNA Sequences

Thomas Royce ‘02

Fast, large scale squence alignment algorithms have been difficult to achieve due to the large amounts of computer resources that they require. The computational burden stems from alignment algorithms’ reliance upon greedy dynamic programming techniques.

This thesis presents a sequential heuristic algorithm that limits the use of dynamic programming to relatively small sub-alignments. Instead, large subsequences of high similarity are identified first and added to a global alignment. Dynamic programming methods are then used to resolve the spaces between these regions of high similarity. The sequential algorithm achieves sequence alignment in linear space and time. Furthermore, the memory usage requirements are lower than any previously published work.

Stemming from the sequential algorithm, the first known parallel algorithm for large-scale alignments is also presented. Subsequence matching and resolving gaps between them account for a bulk of the sequential algoirthm’s time complexity. These tasks are parallelized using message passing and simple load-balancing techniques. The overall work required of the system increases linearly with the size of the inputs.

Peter Djalaliev Main List Heather Mahaney

Print - Changes - Search
Last modified: May 06, 2008, at 12:28 PM.