Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#706 closed request (fixed)

Generalization of Needleman-Wunsch and Smith-Waterman

Reported by: Peter Owned by: Peter
Priority: major Milestone: yat 0.9
Component: utility Version: trunk
Keywords: Cc:


NW-alignment and SW-alignment are special cases of a more general alignment algorithm. I suggest implement this general alignment algorithm and refactor NeedlemanWunsch and SmithWaterman? functions using this algorithm.

There are four differences between NW and SW (in our versions):

  1. Our NW doesn't have an open_gap term, i.e., open_gap=0.
  2. NW has an interface to retrieve the alignment, i.e., which elements aligned.
  3. The Matrix, m, is initialized differently. In NW m(0,n)=m(n,0)=-n*gap and -inf rest of Matrix whereas in SW m(n,m)=0.
  4. In NW returned score is lower right element of m, i.e., m(m.rows()-1, m.columns()-1) whereas in SW max(m) is returned.

To create an algorithm that covers these differences we need:

  1. Support open_gap term (perhaps even add it to NW API)
  2. A way to retrieve the alignment, probably as separate function in a class, so one first calls Aligner::operator() and then Aligner::alignment()
  3. Matrix, m, is initialized outside class and passed to operator()
  4. Matrix m is returned C-style by being passed as a Matrix&.

The most unclear thing is 2.) what is the format to retrieve the alignment. I don't like the format in NW where the path is stored as vector<pair<size_t, size_t> >. Perhaps the best option is to store the Matrix that is called choice in NW and array in SW. But I see no need to store this info in a float Matrix when there are only four cases (see enum in SW).

Change History (4)

comment:1 Changed 10 years ago by Peter

Milestone: yat 0.x+yat 0.9
Status: newassigned

comment:2 Changed 10 years ago by Peter

If we wanna be general, there is no need to keep symmetry that score for insertion/deletions are the same.

comment:3 Changed 10 years ago by Peter

Resolution: fixed
Status: assignedclosed

(In [2815]) closes #706. Implement a generalization of SmithWaternan? and NeedlemanWunsch? in a new class utility::Aligner. Class also provide a way to access not only alignment score but also which elements were aligned (to maximize the score). Old function SmithWaterman? did not provide this functionality and is deprecated for new member function Aligner::smith_waterman.

comment:4 Changed 10 years ago by Peter

ticket #724 was marked as related

Note: See TracTickets for help on using tickets.