[BiO BB] program for sequence length
Karger, Amir
akarger at CGR.Harvard.edu
Mon May 10 20:49:20 EDT 2010
10 points! We do exactly that kind of thing on the Sequences page of the Protocols section. After you get all the sequences you like (those of a certain length, those that are unique, whatever), you can use the column choosing tool to get only the ID, Desc, sequence again, and then use change_tab_to_fasta to get back a FASTA file with just the sequences of interest. A piece of cake for a bioinformaticist, but literally impossible for a non-programmer without this or a similar tool. The coolest part was watching biologists start thinking a bit more like bioinformaticists once they realized the possibilities. My goal was to give non-programmers these tools, so that we coders would be free to work on more interesting, hard stuff. (I never quite got to the "Profit!" step.)
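For readers without the Scriptome handy, the last step (tab-delimited ID, Desc, sequence back to FASTA) can be sketched in Python. This is only an illustration of the idea, not the actual change_tab_to_fasta code; the function name tab_to_fasta, the 60-character line width, and the sample rows are all assumptions made up for the example.

```python
import textwrap

def tab_to_fasta(tab_lines, width=60):
    """Turn tab-delimited lines (ID, description, sequence, ...) into FASTA text.

    Only the first three columns are used; extra columns (such as a
    length column) are ignored. `width` is a hypothetical wrap width.
    """
    records = []
    for line in tab_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        seq_id, desc, seq = line.split("\t")[:3]
        header = f">{seq_id} {desc}".rstrip()  # no trailing space if desc is empty
        records.append("\n".join([header] + textwrap.wrap(seq, width)))
    return "".join(record + "\n" for record in records)

# Made-up sample rows: ID, description, sequence, length
tab = ["seq1\tfirst example\tACGTACGT\t8", "seq2\t\tACG\t3"]
fasta_text = tab_to_fasta(tab, width=4)
print(fasta_text, end="")
```

The column choosing step is folded into the function here by simply ignoring everything after the third column.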
-Amir
________________________________________
From: bbb-bounces at bioinformatics.org [bbb-bounces at bioinformatics.org] On Behalf Of Martin Gollery [marty.gollery at gmail.com]
One nice thing about this approach is that you could then sort them by
length, which might be very handy. You could do things like export all the
sequences of length >x but <y, for example.
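A rough Python sketch of that sort-and-filter idea, assuming the tab-delimited rows have already been read into (ID, description, sequence) tuples. The function name select_by_length and the sample data are hypothetical, chosen only for illustration.

```python
def select_by_length(rows, min_len, max_len):
    """Keep rows with min_len < sequence length < max_len, shortest first.

    Each row is assumed to be an (ID, description, sequence) tuple.
    """
    kept = [row for row in rows if min_len < len(row[2]) < max_len]
    return sorted(kept, key=lambda row: len(row[2]))

# Made-up sample table
table = [
    ("seqA", "long one", "ACGTACGTACGT"),
    ("seqB", "short one", "ACG"),
    ("seqC", "medium one", "ACGTACG"),
    ("seqD", "also medium", "ACGTA"),
]

# Sequences of length >3 but <10, sorted by length
selected = select_by_length(table, 3, 10)
for seq_id, desc, seq in selected:
    print(seq_id, len(seq), sep="\t")
```

Both bounds are exclusive, matching the ">x but <y" wording above.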
Martin Gollery
On Fri, May 7, 2010 at 6:36 AM, Karger, Amir <akarger at cgr.harvard.edu> wrote:
> Check out the Scriptome (yes, this is an advertisement) at
> http://sysbio.harvard.edu/csb/resources/computational/scriptome/ , which
> is a set of Perl one-liners you cut and paste onto your command line to do
> bio-y text-y things.
>
> Use the change_fasta_to_tab tool to change your fasta to a tab-delimited
> file with ID, description, sequence. Then use the calc_col_length tool on
> the result, which will add another column giving the length of the sequence
> column. You can throw that into Excel and hide the sequence column (or use
> choose_cols_to_delete to make a file without the sequences themselves) and
> then read through it at your leisure.
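The two steps described above (FASTA to tab-delimited, then a length column) might look something like this in Python. It is a minimal sketch of the same idea, not the Scriptome's Perl one-liners; the names fasta_to_rows and make_row and the sample FASTA text are assumptions invented for the example.

```python
def make_row(header, chunks):
    """Split a FASTA header into ID and description, join the sequence lines."""
    seq_id, _, desc = header.partition(" ")
    seq = "".join(chunks)
    return seq_id, desc, seq, len(seq)

def fasta_to_rows(lines):
    """Yield (ID, description, sequence, length) for each FASTA record."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield make_row(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield make_row(header, chunks)

# Made-up sample input
fasta = """>seq1 first example
ACGTAC
GT
>seq2 second example
ACG
"""

rows = list(fasta_to_rows(fasta.splitlines()))
for row in rows:
    # One tab-delimited line per sequence: ID, description, sequence, length
    print("\t".join(str(field) for field in row))
```

Dropping the third field before printing would give the sequence-free table that is easier to read through in a spreadsheet.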