[BiO BB] Re: Extracting location from a genbank flatfile
Peter Rice
pmr at ebi.ac.uk
Thu Apr 10 05:50:04 EDT 2003
govind mk wrote:
> I am stuck with a rather simple problem.
> I would like to extract the locations of specific features
> (e.g. CDS) from a GenBank flat file.
>
> I tried using Bioperl but couldn't manage to get the
> exact locations for complicated location
> representations such as
> complement(join(295405..295443,295492..295529))
> because the Bioperl modules return only the minimum start
> and maximum stop.
You can use EMBOSS (the European Molecular Biology Open Software Suite)
http://www.uk.embnet.org/Software/EMBOSS/
EMBOSS is an open-source (GPL/LGPL) package of sequence analysis
libraries and programs.
Among other features, EMBOSS can read EMBL/GenBank, SwissProt and PIR
feature tables and convert them to and from GFF without losing information
(although this requires adding some extra GFF tags to retain
information about complex feature locations). The internals are similar
to those of the Artemis feature table editor from the Sanger Institute.
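To show what a complex location like the one quoted above actually contains, here is a minimal sketch in plain Python. It is not EMBOSS or Bioperl code, and the function name parse_location is hypothetical. It assumes only the simple forms: an optional outer complement(...), an optional join(...) or order(...), and segments written as N..M or a single base N. Fuzzy ends (< and >), between-base positions (^), remote accessions and a complement nested inside a join are not handled and raise an error.

```python
import re


def parse_location(loc):
    """Return (strand, [(start, end), ...]) for a simple GenBank location.

    strand is -1 when the whole location is wrapped in complement(...),
    otherwise 1. Segments are returned in the order they are written.
    """
    loc = loc.strip()
    strand = 1

    # Strip an outer complement(...) wrapper, if present.
    m = re.fullmatch(r"complement\((.*)\)", loc)
    if m:
        strand = -1
        loc = m.group(1)

    # Strip a join(...) or order(...) wrapper, if present.
    m = re.fullmatch(r"(?:join|order)\((.*)\)", loc)
    if m:
        loc = m.group(1)

    segments = []
    for part in loc.split(","):
        # Accept "N..M" or a single position "N"; reject anything else.
        m = re.fullmatch(r"(\d+)(?:\.\.(\d+))?", part.strip())
        if m is None:
            raise ValueError(f"unsupported location segment: {part!r}")
        start = int(m.group(1))
        end = int(m.group(2)) if m.group(2) else start
        segments.append((start, end))
    return strand, segments


if __name__ == "__main__":
    print(parse_location("complement(join(295405..295443,295492..295529))"))
    # prints (-1, [(295405, 295443), (295492, 295529)])
```

Keeping every segment, rather than only the minimum start and maximum stop, is what preserves the exon structure of a feature like the one in the question.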
I am currently extending the feature table internals of EMBOSS for the
next release, to allow deletion/insertion of sequence ranges, and would
be interested in any feedback - especially things that are hard to do
with existing tools.
regards,
Peter Rice
European Bioinformatics Institute.