[BiO BB] genbank orthography

Mike Marchywka marchywka at hotmail.com
Wed Oct 24 10:25:25 EDT 2007

I deleted most of the posts on this thread, but as with the other thread:
if you can reduce the db to text, there are plenty of good tools for
one-time text processing. This is easy with sed and perl. There are
indexing scripts on the web that are only 10-20 lines long. It isn't hard
to find typos in such a list, with or without a spelling dictionary.
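
As a rough illustration of the "no dictionary" route, here is a minimal
sketch in Python rather than sed or perl. It assumes the names have already
been dumped to a plain list of strings; the function names and the 0.8
similarity threshold are my own choices for the example, not anything
GenBank-specific. It groups spellings that differ only in case, spacing or
punctuation, then flags pairs of remaining keys that look like typos of
each other.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase the name and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def group_variants(names):
    """Return {normalized key: sorted raw spellings} for keys with >1 spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name.strip())
    return {key: sorted(raw) for key, raw in groups.items() if len(raw) > 1}


def likely_typos(names, threshold=0.8):
    """Return pairs of distinct normalized keys whose similarity >= threshold.

    The threshold is an assumed starting point; tune it on real data.
    The pairwise loop is quadratic, which is fine for a few thousand names.
    """
    keys = sorted({normalize(n) for n in names})
    pairs = []
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            if SequenceMatcher(None, a, b).ratio() >= threshold:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    sample = ["Viet Nam", "Vietnam", "Vietnem", "China"]
    print(group_variants(sample))
    print(likely_typos(sample))
```

The output still needs a human pass: the script only proposes candidate
groups, and someone has to pick the standard spelling for each one.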

>From: Sterten at aol.com
>Reply-To: "General Forum at Bioinformatics.Org" 
><bio_bulletin_board at bioinformatics.org>
>To: bio_bulletin_board at bioinformatics.org
>Subject: [BiO BB] genbank orthography
>Date: Wed, 24 Oct 2007 03:26:09 EDT
>names are not spelled uniformly, e.g. Viet Nam and Vietnam,
>also many typos, this makes it very difficult to sort and analyse the  
>by computer.
>I'm looking for a complete list of different spellings
>(thousands of entries...) and the suggested standard so we can
>correct/uniformify them automatically.
