[BiO BB] Re: Extracting location from a genbank flatfile

Daniel Ducat daniel.ducat at metalife.de
Thu Apr 10 03:44:38 EDT 2003

Hello Govind

We had the same problem with Genbank locations.
What we did here was to write a C++ program that parses
a Genbank entry file into flatfiles, ready to be imported into a relational
database. In the database (MSSQL) we wrote a stored procedure that parses
every location, however complicated it is, and breaks it into a set
of smaller, simple ones. Note that a location can contain a link to another
entry, for example (100..130, A01234.12..15).
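To illustrate the idea of breaking a compound location into simple ones, here is a minimal Python sketch. It is not our stored procedure, only an assumption of how such a splitter could look. The names `flatten`, `split_top_level` and `SIMPLE` are hypothetical. It assumes the feature table convention in which a link to another entry is written as `accession.version:start..end`, and it handles only `join`, `order`, `complement`, plain ranges, single positions and the `<` / `>` markers; other forms are rejected.

```python
import re

# Operators that wrap other locations (assumed subset of the syntax).
OPERATORS = ("join", "order", "complement")

# A simple location: optional "accession:" prefix, start, optional "..end".
SIMPLE = re.compile(
    r"^(?:(?P<acc>[A-Za-z0-9_]+(?:\.\d+)?):)?"
    r"(?P<start>[<>]?\d+)(?:\.\.(?P<end>[<>]?\d+))?$"
)


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return [p.strip() for p in parts]


def flatten(location, strand=1):
    """Return a list of (accession, start, end, strand) tuples.

    accession is None for positions on the current entry.
    Segment order is kept as written, also inside complement().
    """
    location = location.strip()
    for op in OPERATORS:
        if location.startswith(op + "(") and location.endswith(")"):
            inner = location[len(op) + 1:-1]
            new_strand = -strand if op == "complement" else strand
            segments = []
            for part in split_top_level(inner):
                segments.extend(flatten(part, new_strand))
            return segments
    match = SIMPLE.match(location)
    if match is None:
        raise ValueError("unsupported location: " + location)
    start_text = match.group("start")
    end_text = match.group("end") or start_text  # single position
    return [(match.group("acc"),
             int(start_text.lstrip("<>")),
             int(end_text.lstrip("<>")),
             strand)]
```

Calling `flatten("join(100..130,A01234.1:12..15)")` would give one local segment and one segment that points to the other entry; each tuple can then be stored as one row in the database.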

It looks complicated, but this way we get everything we need.

For a simpler solution, write a program or Perl script (or bash script)
that extracts the location from the feature and parses it. This task is
not as difficult as it seems, since there are clear rules for feature table
location positions in Genbank entry files.
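As a rough sketch of the extraction step, the following Python function collects the location strings of one feature type from a flat file. It assumes the usual fixed-column layout of the feature table (feature key starting in column 6, location and qualifiers starting in column 22) and that a location may continue over several lines until the first qualifier line beginning with "/". The function name `feature_locations` is hypothetical.

```python
def feature_locations(lines, feature_key="CDS"):
    """Return the location strings of all features named feature_key.

    lines is any iterable of text lines, e.g. an open Genbank flat file.
    """
    locations = []
    in_features = False
    current = None  # location being collected, or None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if not in_features:
            continue
        if line and not line[0].isspace():
            # A new top-level keyword (e.g. ORIGIN) ends the feature table.
            if current is not None:
                locations.append(current)
                current = None
            in_features = False
            continue
        key = line[5:21].strip()   # feature key columns
        value = line[21:].strip()  # location or qualifier columns
        if key:
            # A new feature starts; close the previous one if open.
            if current is not None:
                locations.append(current)
            current = value if key == feature_key else None
        elif current is not None:
            if value.startswith("/"):
                # First qualifier: the location is complete.
                locations.append(current)
                current = None
            else:
                # Continuation line of a long location.
                current += value
    if current is not None:
        locations.append(current)
    return locations
```

Used as `feature_locations(open("entry.gb"))` (the file name is only an example), it returns strings such as a complete join(...) expression, which can then be parsed into its parts.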


Daniel Ducat
Senior Database Developer Metalife AG
e-mail: daniel.ducat at metalife.de
Phone: +359 (02) 950-18-04
URL: http://www.metalife.de

-----Original Message-----
From: bio_bulletin_board-admin at bioinformatics.org
[mailto:bio_bulletin_board-admin at bioinformatics.org]On Behalf Of govind mk
Sent: Thursday, April 10, 2003 5:44 AM
To: bio_bulletin_board at bioinformatics.org
Subject: [BiO BB] Re: Extracting location from a genbank flatfile

Hi all

I am stuck with a rather simple problem.
I would like to extract locations of specific features
(e.g. CDS) from a Genbank flat file.

I tried using Bioperl but couldn't manage to get the
exact locations for complicated representations of
locations such as
as Bioperl modules return the minimum start and
maximum stop.

Any suggestions ???

