[BiO BB] program for sequence length

Larye D. Parkins larye at info-engineering-svc.com
Thu May 6 21:23:14 EDT 2010


On Thu, May 6, 2010 6:07 pm, Mike Marchywka wrote:
>
> ----------------------------------------
>> Date: Wed, 5 May 2010 14:46:28 +0530
>> From: pkhurana08 at gmail.com
>> To: bbb at bioinformatics.org
>> Subject: [BiO BB] program for sequence length
>>
>> Hi all,
>>
>> I have a few 1000 fasta files. I would like to get the list showing the
>> sequence name and their respective lengths.
>> Is there a program for this?
>

infoseq, part of the EMBOSS suite, should do what you want.

> You could probably write a Perl or bash script to do it more quickly
> than you could find something, and depending on your overall objective,
> assuming you want to do more, it may help to have something in source
> code that you understand. I was doing a lot of fasta manipulation and
> ended up writing a C++ fasta command-line utility since I needed speed,
> but I never documented it and keep forgetting how it works.
>
> Consider just using sed to put the name and sequence into a single line
> per entry and then look at the lengths using awk or something. For
> things I don't do very often, the learning curve can be a nuisance, and
> it is easier to "reinvent the wheel" with a short script than to
> relearn some special-purpose utility. These general-purpose text
> processing tools can be used anywhere.
>
>
>
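
For anyone who does want the short-script route described above, here is a
minimal sketch in Python rather than sed and awk. It is only an illustration,
not anyone's tested utility. The function name fasta_lengths is made up for
this example, and it assumes plain-text FASTA in which each record starts
with a ">" header line and the sequence name is the first
whitespace-delimited word of that header.

```python
import sys


def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines.

    Hypothetical helper for illustration; assumes '>' starts each header.
    """
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            # Assumption: the name is the first word after '>'.
            name, length = (line[1:].split() or [""])[0], 0
        elif name is not None:
            # Sequence may be wrapped over several lines; add them up.
            length += len(line)
    # Emit the final record once the input is exhausted.
    if name is not None:
        yield name, length


if __name__ == "__main__":
    # Usage (script name is arbitrary): python fasta_lengths.py *.fasta
    for path in sys.argv[1:]:
        with open(path) as handle:
            for name, length in fasta_lengths(handle):
                print(f"{name}\t{length}")
```

Each output line is a sequence name and its length separated by a tab, so
the result can still be piped through sort or awk afterwards.
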
>> I can write one but why reinvent the wheel.
>> Thanking all in advance
>>
>> Regards,
>> Pankaj
>> _______________________________________________
>> BBB mailing list
>> BBB at bioinformatics.org
>> http://www.bioinformatics.org/mailman/listinfo/bbb
>
>
>
>


-- 
Larye D. Parkins
Information Engineering Services
600 Turner Ave.
Shelton, WA 98584
Office: 360 426 1718
Mobile: 360 350 9645
http://www.info-engineering-svc.com

"Making IT work since 1965."
Member of ACM, IEEE Computer Society, USENIX, SAGE, and LOPSA




