Finding a sequence

From Bioinformatics.Org Wiki

(Difference between revisions)
(Article Created)
Line 35: Line 43:
==...I have another sequence.==
This section will be expanded, and there will be a more basic and detailed explanation for novice searchers. In the meantime, here are the top tips cribbed from the excellent [http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10868283&dopt=Abstract paper] by Hugh B. Nicholas Jr., David W. Deerfield II and Alexander J. Ropelewski in [http://www.BioTechniques.com/ BioTechniques].
* Use a local favourite program on the Web server of your choice.
Line 49: Line 57:
* ...when the homologues you are looking for to match your query are highly diverged.
* ...when the query or matches are short.
* ...when you are only interested in a specific (in the sense of "species") subset of database matches with a particular evolutionary relationship to your sequence of interest, a relationship not implied by the default settings.

Revision as of 00:15, 24 November 2010




The most common task in bioinformatics must be the acquisition of some data on which to operate. Usually this is in the form of a nucleic acid or protein sequence, stored as characters in the appropriate alphabet together with a header of related information: for example, some kind of unique identifying number, the species from which the original biological substrate was obtained, the names of any authors who published the sequence, and so on.
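
As a rough illustration of a sequence stored together with its header, here is a minimal sketch that reads a FASTA-style record. The FASTA layout (a ">" header line followed by lines of sequence), the made-up example record and the names SeqRecord and parse_fasta are all assumptions for illustration; the text above does not name a particular file format.

```python
from typing import Iterator, List, NamedTuple


class SeqRecord(NamedTuple):
    """One sequence plus its header information (hypothetical container)."""
    identifier: str   # first word of the header line
    description: str  # remainder of the header line
    sequence: str     # sequence lines joined into one string


def _make_record(header: str, chunks: List[str]) -> SeqRecord:
    # Split the header at the first space: identifier, then free text.
    identifier, _, description = header.partition(" ")
    return SeqRecord(identifier, description, "".join(chunks))


def parse_fasta(text: str) -> Iterator[SeqRecord]:
    """Yield one SeqRecord per '>' header found in FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield _make_record(header, chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


# Made-up record, for illustration only.
EXAMPLE = """>seq1 Example species, hypothetical actin fragment
ATGGATGATGATATCGCC
GCGCTCGTC
"""

records = list(parse_fasta(EXAMPLE))
print(records[0].identifier, len(records[0].sequence))
```

Real database entries usually carry far richer headers than this, so treat the two-field split as a simplification.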

You may have already generated your own sequence data experimentally. In this case you are likely to want to find sequences which are identical or similar (and therefore possibly related) to yours. The task is then one of similarity search.

...I have a description.

A paradoxical problem generated by the success of the bioinformatics revolution is the increasing difficulty of navigating the huge amount of data available. Once you could print out most of the existing sequence databases onto paper and cram them into a single binder. Now a search for "actin" alone will pull out hundreds and hundreds of sequences. The key to finding what you want is to develop your own discriminatory skills rather than relying on computers to figure out what it is you're really after.

Use Entrez-PubMed

Make sure you are clear about your aim first. If you are looking for a sequence for a specific scientific purpose then you may do best to start with a relevant human-generated publication. For example, if you have cloned a gene which is part of a well-characterised biochemical pathway and you want to find sequences of the same functional gene product in other species (orthologues), Entrez PubMed is your friend.

PubMed is a huge and very comprehensive database of the biomedical scientific literature, created by the U.S. National Library of Medicine (NLM). Entrez PubMed is another indispensable resource of the U.S. National Center for Biotechnology Information (NCBI). Both the NLM and the NCBI are part of the National Institutes of Health, within the U.S. Department of Health and Human Services.
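
If you would rather script a PubMed search than type it into the web form, the sketch below shows one way to build the request. It assumes NCBI's E-utilities "esearch" service, which this article does not itself describe; the function name pubmed_search_url and the example search term are made up for illustration. The code only constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service (an assumption here:
# the article itself only discusses the Entrez web interface).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term: str, max_results: int = 20) -> str:
    """Return an esearch URL that looks up `term` in the PubMed database."""
    params = {"db": "pubmed", "term": term, "retmax": max_results}
    return ESEARCH_URL + "?" + urlencode(params)


url = pubmed_search_url("actin orthologue")
print(url)
```

Pasting the printed URL into a browser should return a list of matching PubMed identifiers, which you can then read up on before going anywhere near a sequence database.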

Use Swiss-Prot

Swiss-Prot is curated by human beings.

Use SRS at the RFCGR

[XXXX INSERT DETAILED ADVICE HERE]

Use Boolean logic

[XXXX INSERT DETAILED ADVICE HERE]
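
Until the detailed advice arrives, here is a small sketch of what combining search terms with Boolean operators can look like. It assumes the upper-case AND, OR and NOT operators that many search interfaces, Entrez among them, accept; the helper names all_of, any_of and exclude and the search terms are invented for illustration, and each server's own syntax should be checked before relying on it.

```python
def all_of(*terms: str) -> str:
    """Every term must match (Boolean AND)."""
    return "(" + " AND ".join(terms) + ")"


def any_of(*terms: str) -> str:
    """At least one term must match (Boolean OR)."""
    return "(" + " OR ".join(terms) + ")"


def exclude(query: str, unwanted: str) -> str:
    """Drop any hit that also matches `unwanted` (Boolean NOT)."""
    return f"({query} NOT {unwanted})"


# Actin entries from human or mouse, leaving out anything tagged "review".
query = exclude(all_of("actin", any_of("human", "mouse")), "review")
print(query)
```

The explicit brackets are deliberate: they spell out the order in which the operators apply, so you are not relying on how a particular server chooses to group them.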

Use cunning

[XXXX INSERT DETAILED ADVICE HERE]

...I have an accession number.

[XXXX INSERT DETAILED SEQUENCE ADVICE HERE]

...I have another sequence.

This section will be expanded, and there will be a more basic and detailed explanation for novice searchers. In the meantime, here are the top tips cribbed from the excellent paper by Hugh B. Nicholas Jr., David W. Deerfield II and Alexander J. Ropelewski in BioTechniques.

...I'm not sure whether or not to use the defaults.

Hugh, David and Alexander again on when not to use the default search parameters provided by a server.
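
One case where default parameters can let you down is a short query, because short words turn up by chance surprisingly often. The sketch below puts a rough number on that. It assumes a deliberately crude model (exact matches only, every letter equally likely and independent), which is not the statistics any particular search server actually uses; the function name and the database size are made up for illustration.

```python
def expected_chance_matches(word_length: int,
                            database_length: int,
                            alphabet_size: int = 4) -> float:
    """Expected number of exact chance hits for one word of `word_length`
    letters in a random database of `database_length` letters.

    Crude model: uniform, independent letters; exact matches only.
    """
    if word_length < 1 or alphabet_size < 2:
        raise ValueError("word_length must be >= 1 and alphabet_size >= 2")
    # Number of places in the database where the word could start.
    positions = max(database_length - word_length + 1, 0)
    # Chance that one given position matches the whole word.
    return positions * alphabet_size ** -word_length


# Hypothetical database of one thousand million nucleotides.
DATABASE_LENGTH = 10 ** 9
for k in (8, 12, 20, 30):
    print(k, expected_chance_matches(k, DATABASE_LENGTH))
```

Under these assumptions a very short word is expected to occur many times by chance alone, while a long one is not, which is one reason to think twice before accepting the defaults for a short query.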
