#706 closed request (fixed)
Generalization of Needleman-Wunsch and Smith-Waterman
Reported by: | Peter | Owned by: | Peter |
---|---|---|---|
Priority: | major | Milestone: | yat 0.9 |
Component: | utility | Version: | trunk |
Keywords: | Cc: |
Description
NW-alignment and SW-alignment are special cases of a more general alignment algorithm. I suggest implement this general alignment algorithm and refactor NeedlemanWunsch and SmithWaterman? functions using this algorithm.
There are four differences between NW and SW (in our versions):
- Our NW doesn't have an open_gap term, i.e.,
open_gap=0
. - NW has an interface to retrieve the alignment, i.e., which elements aligned.
- The Matrix, m, is initialized differently. In NW m(0,n)=m(n,0)=-n*gap and
-inf
rest of Matrix whereas in SW m(n,m)=0. - In NW returned score is lower right element of m, i.e., m(m.rows()-1, m.columns()-1) whereas in SW max(m) is returned.
To create an algorithm that covers these differences we need:
- Support open_gap term (perhaps even add it to NW API)
- A way to retrieve the alignment, probably as separate function in a class, so one first calls Aligner::operator() and then Aligner::alignment()
- Matrix, m, is initialized outside class and passed to operator()
- Matrix m is returned C-style by being passed as a Matrix&.
The most unclear thing is 2.) what is the format to retrieve the alignment. I don't like the format in NW where the path is stored as vector<pair<size_t, size_t> >. Perhaps the best option is to store the Matrix that is called choice in NW and array in SW. But I see no need to store this info in a float Matrix when there are only four cases (see enum in SW).
Change History (4)
comment:1 Changed 11 years ago by
Milestone: | yat 0.x+ → yat 0.9 |
---|---|
Status: | new → assigned |
comment:2 Changed 11 years ago by
comment:3 Changed 11 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
(In [2815]) closes #706. Implement a generalization of SmithWaternan? and NeedlemanWunsch? in a new class utility::Aligner. Class also provide a way to access not only alignment score but also which elements were aligned (to maximize the score). Old function SmithWaterman? did not provide this functionality and is deprecated for new member function Aligner::smith_waterman.
If we wanna be general, there is no need to keep symmetry that score for insertion/deletions are the same.