[BiO BB] Observation: multiple sequence alignment affected by the input sequence order

Mike Marchywka marchywka at hotmail.com
Mon Aug 27 15:16:18 EDT 2007


Does an automated test script exist somewhere for validating clustalw
builds? I tried to e-mail the contact points listed in the README in their
distribution, but at least two bounced and I'm not sure about the third
yet. Anyway, I have a C++ version, as I needed to modify the output to
work with some other code I'm writing. That worked: if anyone wants to
convert the clustalw data structures to a std::vector<SeqTy>, I have a
reasonably self-contained class that appears to do it ( SeqTy is just a
shell right now with sequence and name, but you can add to it ).
However, I then went ahead and got the rest of their code to build under
C++, but I don't have any way to test it beyond my immediate interests
( there is a lot of hard-to-follow stuff with memory allocation and
0/1 based subscripts all over ).
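
For anyone wondering what such a conversion might look like, here is a
minimal sketch. It is not the class mentioned above and it does not use
the real clustalw globals: the input layout (two parallel C arrays with
valid entries 1..nseqs and slot 0 unused) and the name fromOneBased are
assumptions made purely for illustration. Only SeqTy, with its name and
sequence members, comes from the description in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal record: just a name and a sequence, as described in the post.
struct SeqTy {
    std::string name;
    std::string sequence;
};

// HYPOTHETICAL input layout (an assumption, not the actual clustalw data):
// two parallel C arrays whose valid entries are 1..nseqs, slot 0 unused.
// Copying into a vector confines the 1-based indexing to this one loop;
// everything downstream can use ordinary 0-based subscripts.
std::vector<SeqTy> fromOneBased(const char* const* names,
                                const char* const* seqs,
                                std::size_t nseqs) {
    std::vector<SeqTy> out;
    out.reserve(nseqs);
    for (std::size_t i = 1; i <= nseqs; ++i) {
        out.push_back(SeqTy{names[i], seqs[i]});
    }
    return out;
}
```

Doing the 1-based to 0-based shift in a single function is one way to
keep the mixed subscript conventions from leaking into new code.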


FWIW, if you get the clustalw source, the README contains references
to the underlying algorithm papers. I try to put links in my source
code, and live links ( -help foo ) in scripts that open browsers or
download webpages for help ( hard to maintain, but OK for many things ).
Grepping source code for links isn't for everyone, however :)



Thanks.

Mike Marchywka
586 Saint James Walk
Marietta GA 30067-7165
404-788-1216 (C)<- leave message
989-348-4796 (P)<- emergency only
marchywka at hotmail.com
Note: Hotmail is blocking my mom's entire
ISP claiming it is to reduce spam but probably
to force users to use hotmail. Please DON'T
assume I am ignoring you and try
me on marchywka at yahoo.com if no reply
here. Thanks.





>From: "Iain Wallace" <iain.m.wallace at gmail.com>
>Reply-To: "General Forum at Bioinformatics.Org" 
><bio_bulletin_board at bioinformatics.org>
>To: "General Forum at Bioinformatics.Org" 
><bio_bulletin_board at bioinformatics.org>
>Subject: Re: [BiO BB] Observation: multiple sequence alignment affected by
>the input sequence order
>Date: Fri, 24 Aug 2007 11:02:06 +0100
>
>Hi,
>
>I find this behavior very strange, as the programmes are designed not to
>exhibit this behavior.
>The first step in most alignment programmes is an all against all
>comparison, from which a tree is built. This tree is then used to determine
>the order in which the sequences are aligned. There is no dependence on
>input order in any of the alignment methods mentioned.
>
>There are a few methods that can be used to compare alignments (and to make
>sure that they are identical when only the ordering is changed), such as
>aln_compare from Cedric Notredame, Q_score from Bob Edgar (
>http://www.drive5.com/) or veralign from Jaap Heringa (online server,
>http://zeus.cs.vu.nl/programs/veralignwww/)
>
>I would recommend that you redo your alignment using any of the programmes
>you mentioned, then change the input order and compare the two
>alignments. FYI, clustal has an option to output the alignment in the
>order that the sequences were aligned, and this shouldn't change
>regardless of the input order.
>
>Hope this helps
>
>Iain
>
>
>
>On 8/16/07, Hongyu Zhang <forward at hongyu.org> wrote:
> >
> > Dear all,
> >
> > I've observed that several multiple sequence alignment programs,
> > including ProbCons, ClustalW and MUSCLE, all share the same behavior,
> > i.e., given a group of sequences in FASTA format as the input, if I
> > change the order of the sequences in the input file, the results
> > generated by those programs will change as well. It's not just the
> > sequence order that will change, but also the amino acid matches.
> >
> > I think it's a little counter-intuitive because one would expect the
> > opposite. Is there a program that can output a stable alignment
> > independent of the input sequence order? Thanks!
> >
> > _______________________________________________
> > General Forum at Bioinformatics.Org -
> > BiO_Bulletin_Board at bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
> >
>_______________________________________________
>General Forum at Bioinformatics.Org - BiO_Bulletin_Board at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
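
The check suggested in the quoted reply (align, shuffle the input, align
again, compare) can be sketched very roughly as below. This is a strict
row-by-row equality test and not the scoring done by aln_compare, Q_score
or veralign. The names Row and sameAlignmentIgnoringOrder are made up for
illustration, and parsing the alignment files into rows is assumed to
happen elsewhere.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One aligned row: sequence name plus its gapped sequence.
using Row = std::pair<std::string, std::string>;

// Returns true when both alignments contain exactly the same gapped rows,
// ignoring the order in which the rows are listed. The vectors are taken
// by value so the caller's copies are left unsorted.
bool sameAlignmentIgnoringOrder(std::vector<Row> a, std::vector<Row> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}
```

If this returns false for two runs that differ only in input order, then
the residue matches really did change and not just the row order.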




