[BiO BB] program for sequence length

Mike Marchywka marchywka at hotmail.com
Thu May 6 21:07:22 EDT 2010

----------------------------------------
> Date: Wed, 5 May 2010 14:46:28 +0530
> From: pkhurana08 at gmail.com
> To: bbb at bioinformatics.org
> Subject: [BiO BB] program for sequence length
>
> Hi all,
>
> I have a few 1000 fasta files. I would like to get the list showing the
> sequence name and their respective lengths.
> Is there a program for this?

You could probably write a perl or bash script for this more quickly
than you could find an existing tool. Depending on your overall objective,
and assuming you want to do more later, it may also help to have something
in source code that you understand. I was doing a lot of fasta manipulation
and ended up writing a C++ fasta command-line utility because I needed
speed, but I never documented it and keep forgetting how it works.

Consider just using sed to put the name and sequence onto a single line per entry,
and then look at the lengths using awk or something similar. For things
I don't do very often, the learning curve can be a nuisance, and it is easier
to "reinvent the wheel" with a short script than to relearn some special-purpose utility.
These general-purpose text processing tools can be used anywhere.
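
As a rough sketch of the idea (not a tested tool), something like the
following would probably do. It uses a single awk pass instead of sed
followed by awk, and it assumes every record starts with a ">" header line,
that the sequence may wrap over several lines, and that the name is the
first word of the header. The function name fasta_lengths is just made up
for illustration.

```shell
# fasta_lengths (hypothetical name): print "name<TAB>length" for every
# record in the FASTA files given as arguments.
# Assumptions: headers start with ">", the name is the first word after
# ">", and sequence lines may contain stray spaces, tabs or CRs.
fasta_lengths() {
    awk '
        # Print the record collected so far, if any header has been seen.
        function emit() { if (seen) printf "%s\t%d\n", name, len }

        # Header line: finish the previous record and start a new one.
        /^>/ { emit(); name = substr($1, 2); len = 0; seen = 1; next }

        # Sequence line: drop whitespace and carriage returns, then count.
        { gsub(/[ \t\r]/, ""); len += length($0) }

        # End of input: print the last record.
        END { emit() }
    ' "$@"
}
```

Usage would be something like "fasta_lengths *.fasta > lengths.txt", which
gives one tab-separated line per sequence.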



> I can write one but why reinvent the wheel.
> Thanking all in advance
>
> Regards,
> Pankaj
> _______________________________________________
> BBB mailing list
> BBB at bioinformatics.org
> http://www.bioinformatics.org/mailman/listinfo/bbb
 		 	   		  


