[BiO BB] fastacmd - sequence retrieval using "string" ?

Shameer Khadar skhadar at gmail.com
Wed Jan 31 12:43:19 EST 2007


Dear Malcolm,
Thanks for such a detailed reply!
I am sorry for the poor description of my problem.

I am aware that fastacmd -s can search using the accession ID (say, a set of
numbers). I am looking for an option to quickly search the nr database and
retrieve sequences based on a "query string".

For example, if the following is a snippet of a sequence from nr:
> gi|15674171|ref|NP_268346.1,gi|Homo Sapiens - Kinase 1
MTHSTCC.....
I need to retrieve the above entry (and, of course, any similar entries)
based on a query string, say "Homo Sapiens". I know this can be done with a
Perl script, and I have coded one for myself, but I need something quick
like fastacmd -s.
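
To make the request concrete, here is a minimal Python sketch of the kind of definition-line filter described above. It is an illustration only, not a fastacmd feature: it assumes a FASTA-format copy of nr is available on disk, and the names deflines_matching, first_gi and write_gi_list are hypothetical helpers invented for this example. It scans only the '>' lines for the query string and collects one GI number per matching record.

```python
import re

# Matches the numeric GI in a definition line such as "gi|15674171|ref|NP_268346.1".
GI_PATTERN = re.compile(r"gi\|(\d+)")


def deflines_matching(fasta_lines, query):
    """Yield definition lines (without the leading '>') that contain
    the query string, compared case-insensitively."""
    needle = query.lower()
    for line in fasta_lines:
        if line.startswith(">") and needle in line.lower():
            yield line[1:].strip()


def first_gi(defline):
    """Return the first GI number found in a definition line, or None."""
    match = GI_PATTERN.search(defline)
    return match.group(1) if match else None


def write_gi_list(fasta_path, query, out_path):
    """Write one GI per matching record to out_path and return the count.

    Assumption: one GI per record is enough to retrieve that record later.
    """
    seen = set()
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for defline in deflines_matching(fasta, query):
            gi = first_gi(defline)
            if gi is not None and gi not in seen:
                seen.add(gi)
                out.write(gi + "\n")
    return len(seen)
```

The resulting GI list could then be handed to fastacmd through the -i batch option quoted further down in this thread. This still reads every definition line once, so it is not as quick as an indexed lookup with -s.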

I hope my question is clearer this time.
Thanks for all the time you have spent on this!
-- 
Happy Bioinformatics
Across the miles... Shameer Khadar



On 1/31/07, Cook, Malcolm <MEC at stowers-institute.org> wrote:
>
> Shameer,
>
> It is unclear to me exactly what you want to do.  What exactly do you
> mean by "string query"?
>
> Does knowing that the following three commands return the same result
> answer your question?:
>
> > fastacmd -s 15674171,66818355
> > fastacmd -s NP_268346.1,XP_642837.1
> > fastacmd -s 'gi|15674171|ref|NP_268346.1,gi|66818355|ref|XP_642837.1'
>
> (note: you must quote the query to prevent the shell from trying to
> interpret the '|' character as a pipe operator).
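
The quoting point above also matters when fastacmd is driven from a script. The following is a small Python sketch under stated assumptions: shell_command and argv_command are hypothetical helper names, and only the -s option from this thread is used. It shows two ways to keep '|' away from the shell: quoting the query with shlex.quote, or passing an argument list so that no shell is involved at all.

```python
import shlex


def shell_command(ids):
    """Build a shell-safe command string; shlex.quote adds quotes only
    when the query contains characters the shell would interpret."""
    query = ",".join(ids)
    return "fastacmd -s " + shlex.quote(query)


def argv_command(ids):
    """Build an argument list suitable for subprocess.run(argv).
    No shell parses it, so '|' needs no quoting."""
    return ["fastacmd", "-s", ",".join(ids)]


if __name__ == "__main__":
    full_ids = ["gi|15674171|ref|NP_268346.1", "gi|66818355|ref|XP_642837.1"]
    print(shell_command(full_ids))
    print(argv_command(full_ids))
```

Plain GI numbers and accessions contain no shell metacharacters, so shell_command leaves them unquoted.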
>
> If this does not help you, then I'm really unsure what you're after...
>
> The options that appear relevant to your need, taken from running
> fastacmd with --help as only option, are
>
>   -s  Comma-delimited search string(s).
>       GIs, accessions, loci, or fullSeq-id strings may be used,
>       e.g. 555, AC147927, 'gnl|dbname|tag' [String]  Optional
>   -i  Input file with GIs/accessions/loci for batch
>       retrieval [String]  Optional
>   -L  Range of sequence to extract (Format: start,stop)
>       0 in 'start' refers to the beginning of the sequence
>       0 in 'stop' refers to the end of the sequence [String]  Optional
>     default = 0,0
>
> If you want subsequences (ranges) from a bunch of different
> sequences, you must make separate calls to fastacmd.  The -L option will
> not help you with this.  The -L option only allows you to specify a
> single range.  If you use it in conjunction with multiple comma-delimited
> search strings, this single range is applied equally to all of
> the resulting sequences.
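
The "separate calls" approach described above could be scripted roughly as follows. This is a sketch, not a tested pipeline: range_commands is a hypothetical helper, the example ranges are made up, and it only builds the argument lists (using the -s and -L options quoted above) without running fastacmd.

```python
def range_commands(requests):
    """Build one fastacmd argument list per (identifier, start, stop) request.

    Each sequence gets its own call because -L accepts a single
    start,stop range that applies to every sequence named by -s.
    """
    commands = []
    for seq_id, start, stop in requests:
        commands.append(["fastacmd", "-s", seq_id, "-L", f"{start},{stop}"])
    return commands


if __name__ == "__main__":
    # Hypothetical ranges; 0 in 'stop' refers to the end of the sequence.
    wanted = [("NP_268346.1", 1, 50), ("XP_642837.1", 20, 0)]
    for argv in range_commands(wanted):
        print(" ".join(argv))
```

Each list can be passed to subprocess.run to execute the calls one after another.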
>
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
>
>
> > -----Original Message-----
> > From: bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org
> > [mailto:bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org]
> > On Behalf Of Shameer Khadar
> > Sent: Tuesday, January 30, 2007 9:19 PM
> > To: General Forum at Bioinformatics.Org
> > Subject: [BiO BB] fastacmd - sequence retrieval using "string" ?
> >
> > Dear All,
> >
> > Is it possible to retrieve sequence(s) from a fastacmd nr
> > database based on string queries delimited by commas?
> > I know it is possible with accession IDs. Is there any
> > way to do it with a string query?
> >
> > Thanks,
> > Shameer
> > _______________________________________________
> > General Forum at Bioinformatics.Org -
> > BiO_Bulletin_Board at bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
> >


