[BiO BB] Testing a Smith-Waterman algorithm?

Martin Gollery marty.gollery at gmail.com
Sat Mar 11 11:11:22 EST 2006


Yes, I believe you've got it. A local alignment between
"disestablishmentarianism" and
"reestablishmentSomeNonMatchingPart" would pick out "establishment" and
ignore the rest, but a fully global alignment has to account for both
sequences end to end, so the leading d is forced against the leading r and
the trailing m against the trailing t (or against gaps), and each of those
costs score.
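
Since you are about to write tests, here is a minimal sketch in Python of how
the word examples could be turned into checks. The scoring values (match +2,
mismatch -1, linear gap -1) and the function names are my own assumptions for
illustration, not a standard matrix or anyone's reference implementation; real
protein work would use a substitution matrix and usually affine gaps.

```python
def _sub(x, y, match, mismatch):
    """Substitution score for one pair of characters (illustrative values)."""
    return match if x == y else mismatch


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment of a and b: returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = _sub(a[i - 1], b[j - 1], match, mismatch)
            # The 0 is what makes the alignment local: a bad prefix is dropped.
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score falls to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = _sub(a[i - 1], b[j - 1], match, mismatch)
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def global_score(a, b, match=2, mismatch=-1, gap=-1):
    """Needleman-Wunsch score only: both strings must be consumed end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = _sub(a[i - 1], b[j - 1], match, mismatch)
            cur.append(max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


pairs = [
    ("extrapolate", "extra"),
    ("disestablishmentarianism", "reestablishmentSomeNonMatchingPart"),
]
for a, b in pairs:
    print(a, b, "local:", smith_waterman(a, b), "global:", global_score(a, b))
```

Under these assumed scores, the local alignment of the two long strings should
come back as "establishment" against "establishment" with a score of 26 (13
matches at +2), while the global score is lower because the unmatched ends
have to be paid for. A useful test is exactly that comparison: the local score
never drops below the score of the best shared substring, whatever you append
to either end.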

Marty



On 3/11/06, Theodore H. Smith <delete at elfdata.com> wrote:
>
>
> Hi people,
>
> I've successfully designed, written and compiled a program that uses
> the Smith-Waterman algorithm.
>
> Nothing new there, but it's for an interesting project, and before
> the project is complete, perhaps some questions asked to
> bioinformaticians can help bring me up to your level.
>
> The next stage after compiling is testing my algorithm, so I now need
> to write some tests for my code.
>
> This is where I am seeing that I'm unsure if I even understand Smith-
> Waterman properly! I understand Levenshtein OK (similar to Needleman-
> Wunsch), but Smith-Waterman I'm a bit unclear on.
>
> Mostly I'm wondering exactly how local matching helps us compared with
> global matching. I got a lay person's description of why it helps,
> but I'm more interested in getting an exact feel for it.
>
> Does it make sense to use English words as an example here, instead
> of protein sequences? That would help me understand this a bit
> better, as I have a better feel for English than proteins (unlike
> many of you).
>
> Would the main advantage then be in searching for short sequences
> within long ones, without being unfairly penalised by the
> non-matching ends of the long sequence?
>
> For example: "extrapolate" could match "extra", far better in Smith-
> Waterman than it could using Levenshtein, because we aren't being
> penalised so badly by the "polate" part.
>
> Or perhaps: "specialisation" would match "lisation" far better using
> local than global, because we aren't being penalised by the "specia"
> part so much.
>
> Or even: "disestablishmentarianism" would match "establishment" far
> better using local than global, because we aren't being penalised by
> "dis" or "arianism".
>
> Is that how local searches like Smith-Waterman benefit us?
>
>
> What about when we are searching for two long sequences of which only
> a small part will match?
>
> Let's say "disestablishmentarianism" against
> "reestablishmentSomeNonMatchingPart".
>
> A local alignment should be able to figure out that "establishment"
> aligns well in this case.
>
> Is that basically how Smith-Waterman helps us?
>
> --
> http://elfdata.com/plugin/



--
Martin Gollery
Associate Director
Center For Bioinformatics
University of Nevada at Reno
Dept. of Biochemistry / MS330
775-784-7042