Bioinformatics.org
BIRCH: Comprehensive bioinfo. system - Support tickets


    [ Ticket #1190 ] BioLegato can not read GenBank CONTIG entries
    Date: 10/15/13 17:11
    Submitted by: B_Fristensky
    Assigned to: Alvare
    Category: bioLegato
    Priority: 5
    Ticket group: Bug
    Resolution: Resolved
    Summary: BioLegato can not read GenBank CONTIG entries
    Original submission:
    GenBank entries in the CON division contain CONTIG fields rather than sequences. Where the ORIGIN line and the sequence would normally appear, we see lines such as:

    /translation="SGAFKSLISSAFVSWKTTGKLQQTVRDSVERTGRGLHSGEISTV
    KILPAAARFGRRFLFRSTVIAASIDNVVKETPLCTTLSKDGCTIRTVEHLLSALEASG
    VDNCLIEIAGSGDCQRSIEV"
    CONTIG join(AUSU01008588.1:1..7393)
    //

    Entries such as this are saved by NCBI Entrez if the 'GenBank' option is chosen; the complete sequence is saved if 'GenBank (full)' is chosen.

    Currently, BioLegato stops reading a GenBank entry at the first CONTIG line and fails to read the rest of the entry. BioLegato should at least be able to read such files as empty sequences. (Even better would be if the Open function could go to NCBI and get the full entry, but that is too complicated.) At the very least, BioLegato should always terminate reading the current sequence upon encountering the // line, and begin a new sequence at that point.

    Examples of GenBank entries with CONTIGs instead of sequences include accession numbers KE535216, NW_003763341, and GL982899.
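As an illustration of the "go to NCBI and get the full entry" idea mentioned above, the following is a minimal sketch that only builds the NCBI E-utilities efetch URL for a full record, assuming the `gbwithparts` return type is the counterpart of the 'GenBank (full)' option. The function name `full_genbank_url` is hypothetical, and nothing here reflects how BioLegato is or would be implemented.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def full_genbank_url(accession: str) -> str:
    """Build an efetch URL requesting the GenBank flat file with sequence.

    Hypothetical helper: it only constructs the URL and performs no
    network access.
    """
    params = {
        "db": "nuccore",           # nucleotide database
        "id": accession,           # e.g. one of the accessions in this ticket
        "rettype": "gbwithparts",  # assumed equivalent of 'GenBank (full)'
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


url = full_genbank_url("KE535216")
```

The resulting URL could then be passed to any HTTP client; error handling and rate limiting are left out of this sketch.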

    So the fix should be to read a zero-length sequence when a CONTIG field is encountered.
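The requested behaviour can be sketched as follows. This is a minimal illustration in Python, not BioLegato's actual code: the function name `read_genbank_records` is hypothetical, the parser ignores everything except LOCUS, ORIGIN, CONTIG and the // terminator, and the sample records are made up. It always closes the current record at //, and it yields an empty sequence for records that carry a CONTIG field instead of an ORIGIN block.

```python
from typing import Iterator, List, Tuple


def read_genbank_records(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (locus_name, sequence) for each record in GenBank flat-file text."""
    name = ""
    seq_parts: List[str] = []
    in_origin = False
    seen_content = False

    for line in text.splitlines():
        if line.rstrip() == "//":
            # Record terminator: always close the current record here,
            # whether or not any sequence was read.
            if seen_content:
                yield name, "".join(seq_parts)
            name, seq_parts, in_origin, seen_content = "", [], False, False
            continue
        if not line.strip():
            continue
        seen_content = True
        if line.startswith("LOCUS"):
            fields = line.split()
            name = fields[1] if len(fields) > 1 else ""
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("CONTIG"):
            # CONTIG holds join(...) instructions, not residues:
            # the sequence for this record stays empty.
            in_origin = False
        elif in_origin:
            # Sequence lines look like "        1 gatcctccat ..."; keep letters only.
            seq_parts.append("".join(ch for ch in line if ch.isalpha()))

    # Tolerate a final record that lacks the // terminator.
    if seen_content:
        yield name, "".join(seq_parts)


# Made-up two-record input: one CON-style record, one with a sequence.
sample = """LOCUS       CONREC   7393 bp    DNA     linear   CON
CONTIG      join(AUSU01008588.1:1..7393)
//
LOCUS       SEQREC   12 bp    DNA     linear   PLN
ORIGIN
        1 gatcctccat at
//
"""
records = list(read_genbank_records(sample))
```

With this input, the first record comes back with a zero-length sequence and the second record is still read, which is the behaviour the ticket asks for.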



    Followups
    Comment                          Date            By
    http://www.biostars.org/p/9869/  10/21/13 11:25  Alvare
    No results for "Dependent on ticket"
    No results for "Dependent on Task"
    No other tickets are dependent on this ticket
    Ticket change history
    Field          Old value       Date            By
    status_id      Open            10/30/13 15:18  B_Fristensky
    resolution_id  Not Resolved    10/30/13 15:18  B_Fristensky
    close_date     12/31/69 19:00  10/30/13 15:18  B_Fristensky
    status_id      Pending         10/15/13 17:12  B_Fristensky
    assigned_to    unset           10/15/13 17:12  B_Fristensky
    resolution_id  Unset           10/15/13 17:12  B_Fristensky

     
