[BiO BB] Need fair alignment tool comparison/ using DSCAM for tool testing

Mike Marchywka marchywka at hotmail.com
Tue Feb 19 15:12:09 EST 2008


> So, ok 67+25=92 seconds is not real impressive compared to 17, and I'm not sure how
> much I can blame cygwin for this :) I guess once I'm sure I have a useful algorithm,
> I can subtract IO time which has been significant in many cases.

I wasn't going to bother looking, given that the time difference is > 4x, but I did note they tested on
a 3 GHz Pentium 4, and I have something that comes up as "x86 Family 6 Model 8 Stepping 3",
which is probably ca. 1 GHz (I never bothered to check since I thought a 2-3x factor wasn't
important). I guess by the time you subtract IO it may be pretty close. It would
be hard to blame cygwin for the computational time, however :)
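
A rough sketch of that arithmetic in Python, for anyone who wants to plug in their own numbers. The 67 s, 25 s and 17 s figures come from this thread; everything else is an assumption: the 1 GHz clock is only the guess above, linear scaling with clock speed is crude (a Pentium 4 and a Family 6 chip do different amounts of work per cycle), and IO_FRACTION is a hypothetical placeholder, not a measurement.

```python
# Back-of-envelope comparison of the timings quoted in this thread.
# From the `date` stamps: 67 s indexing + 25 s coarse alignment.
# Advertised for MUMmer: 17 s (what that covers is not established here).
INDEX_S = 67.0
ALIGN_S = 25.0
MUMMER_S = 17.0

# Assumptions, not measurements:
THEIR_CLOCK_GHZ = 3.0  # Pentium 4 reported for the MUMmer test
LOCAL_CLOCK_GHZ = 1.0  # "probably ca. 1 GHz", never checked
IO_FRACTION = 0.3      # hypothetical share of wall time spent on IO


def scaled_estimate(total_s, io_fraction, clock_from_ghz, clock_to_ghz):
    """Drop the assumed IO share, then scale compute time linearly by clock ratio."""
    compute_s = total_s * (1.0 - io_fraction)
    return compute_s * clock_from_ghz / clock_to_ghz


total_s = INDEX_S + ALIGN_S
raw_ratio = total_s / MUMMER_S
estimate_s = scaled_estimate(total_s, IO_FRACTION, LOCAL_CLOCK_GHZ, THEIR_CLOCK_GHZ)

print(f"total: {total_s:.0f} s, raw ratio: {raw_ratio:.1f}x")
print(f"clock-scaled compute estimate: {estimate_s:.1f} s vs {MUMMER_S:.0f} s")
```

The estimate moves a lot with IO_FRACTION, so it is only useful once the IO time has actually been measured.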




> From: marchywka at hotmail.com
> To: bbb at bioinformatics.org; larye at info-engineering-svc.com
> Date: Tue, 19 Feb 2008 11:51:39 -0500
> Subject: Re: [BiO BB] Need fair alignment tool comparison/ using DSCAM for tool testing
>
>
>> We have been using MUMmer3 (http://mummer.sourceforge.net) for rapid
>> alignments of whole genomes, genomes and contigs, and searching for
>
> Thanks- that looks like a good tool that I didn't know about. I noticed they advertise E. coli results,
> prompting me to go back and check my own. I'd have to go check the suffix tree literature
> to see what exactly they claim to do in 17 seconds on E. coli, but under cygwin, I was able to
> index all matching strings of length 25 or more in about 67 seconds,
>
> $ date;$progpath/string_test -fastas both_fasta -index 8 -length 25 -fix 12 -output 3 -filterN -filterID -status -fcompare_all > anchors ;date
> Sat Nov 10 18:45:23 EST 2007
> string_test.cpp177 loaded 2 fastas
> Sat Nov 10 18:46:30 EST 2007
>
>
> and create a coarse alignment in another 25 seconds,
>
> $ date; $progpath/mm_align_tool -fastas both_fasta -v -pair_rules anchors -doall -pair_align 0 -output text > align1 ;date
> Sat Nov 10 18:50:01 EST 2007
> mm_hit_classes.h389
> annotation_model.h57 Loaded 33373 pair rules.
> mm_align_tool.cpp309 Doing string PAIR align with cutoff 3
> mm_align_tool.h227 do_all with only one rule, did you mean -mrules?
> mm_align_tool.cpp318 doing 0 vs 1
> mm_align_tool.cpp326 do hit dump rules
> Sat Nov 10 18:50:26 EST 2007
>
>
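
For readers wondering what "index all matching strings of length 25 or more" could involve, here is a minimal Python sketch of seed-and-extend exact matching between two sequences. It is not the string_test implementation, which the post does not describe, and it is not the suffix tree approach MUMmer uses. Reading -index 8 as a seed word size and -length 25 as a minimum match length is an assumption, and the function name exact_matches is hypothetical. It does no N filtering and ignores reverse complements.

```python
from collections import defaultdict


def exact_matches(a, b, k=8, min_len=25):
    """Return (start_a, start_b, length) for maximal exact matches >= min_len.

    Seed-and-extend: index every k-mer of `a`, look up each k-mer of `b`,
    and extend each seed hit to the right while the characters agree.
    """
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)

    matches = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            # Skip seeds that are not the left end of a maximal match;
            # the seed one position to the left will report the same match.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            length = k
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            if length >= min_len:
                matches.append((i, j, length))
    return sorted(matches)


# Toy input: a 30-base shared region with different flanks on each side.
shared = "ACGTTGCAAGCTTAGGCTAACGTCCATGGA"
a = "TTTTT" + shared + "CCCCC"
b = "GGG" + shared + "AAAAA"
print(exact_matches(a, b))  # [(5, 3, 30)]
```

A dictionary of k-mers is easy to write but memory-hungry on whole genomes, which is one reason suffix-tree or suffix-array tools exist.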
> Do you have actual timing tests for various complete tasks or is 17 seconds about it?
> So, ok 67+25=92 seconds is not real impressive compared to 17, and I'm not sure how
> much I can blame cygwin for this :) I guess once I'm sure I have a useful algorithm,
> I can subtract IO time which has been significant in many cases.
> Someone also privately suggested blast's bl2seq and I would point out that this is quite fast on pairs
> of 50k sequences.
>
>
>
>
> Mike Marchywka
>
>
