[BiO BB] Obtaining lineage information from an NCBI taxId

James Wagner jrwagner at sfu.ca
Thu Sep 27 16:44:18 EDT 2007


Hello, I was just trying to obtain the full phylogenetic lineage from a
given NCBI taxonomy ID using BioPerl. What I am discovering is that some of
these IDs are missing names at certain levels. For example, for ID 1166 at
http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=1128&lvl=3&keep=1&srchmode=1&unlock&lin=s

the lineage
Bacteria<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=2&lvl=3&keep=1&srchmode=1&unlock>;
Cyanobacteria<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=1117&lvl=3&keep=1&srchmode=1&unlock>;
Chroococcales<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=1118&lvl=3&keep=1&srchmode=1&unlock>;
Microcystis<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=1125&lvl=3&keep=1&srchmode=1&unlock>
is obtained.

From the tooltips one can see that these are the Kingdom, Phylum, Order,
and Genus respectively, but Family and Class are missing. While I can get
this lineage from BioPerl, I cannot figure out how to determine
specifically that Family and Class are missing. Is there some way to script
NCBI (or anywhere else) to retrieve the rank of each entry without
resorting to screen scraping? These tooltips are the only place where I can
find this information. Or is there some sort of rule in bacterial taxonomy
that I can apply to make this easier?
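
To make the goal concrete, the check being asked for can be sketched in a
few lines of Python, assuming each lineage entry is already paired with its
rank (which is exactly the piece that is hard to obtain). The function name
missing_ranks and the list of expected ranks are illustrative assumptions,
and the rank labels are the ones read off the tooltips, which may not match
NCBI's own rank names exactly.

```python
# Ranks expected in a "full" lineage. This list is an assumption made for
# the sketch; extend or rename entries to match the ranks actually returned.
EXPECTED_RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]


def missing_ranks(lineage, expected=EXPECTED_RANKS):
    """Return the expected ranks that have no entry in the lineage.

    lineage is an iterable of (rank, name) pairs.
    """
    present = {rank.lower() for rank, _name in lineage}
    return [rank for rank in expected if rank not in present]


# The lineage from the example above, with ranks as shown in the tooltips.
lineage = [
    ("kingdom", "Bacteria"),
    ("phylum", "Cyanobacteria"),
    ("order", "Chroococcales"),
    ("genus", "Microcystis"),
]

print(missing_ranks(lineage))  # ['class', 'family']
```
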
Thanks,

James


