[BiO BB] difference between Refseq and Uniprot

Frank Lee lifei03 at gmail.com
Mon Jun 27 07:09:08 EDT 2005


In the gpff.gz file from the RefSeq database, transcript variants of a 
gene are given as independent entries. Here is an example of a transcript 
variant annotation from the RefSeq database:

LOCUS       NP_739577                433 aa            linear   PRI 27-OCT-2004

...................
  Transcript Variant: This variant (2) lacks an in-frame segment of
            the coding region, compared to variant 1. It encodes a shorter
            isoform (2), that is missing an internal segment compared to
            isoform 1.
................
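
A minimal sketch of how these annotations might be pulled out of the 
gpff file with Python's standard library. The function names and the 
abbreviated SAMPLE record are assumptions for illustration; only the 
LOCUS line and the Transcript Variant text come from the entry above. 
It assumes the usual flat-file layout (a COMMENT keyword with indented 
continuation lines, records ending in "//") and has not been run 
against a full release.

```python
import gzip
import re

# Abbreviated record: the LOCUS line and the variant text are from the entry
# quoted above; the COMMENT/FEATURES scaffolding is a placeholder.
SAMPLE = """\
LOCUS       NP_739577                433 aa            linear   PRI 27-OCT-2004
COMMENT     Transcript Variant: This variant (2) lacks an in-frame segment of
            the coding region, compared to variant 1. It encodes a shorter
            isoform (2), that is missing an internal segment compared to
            isoform 1.
FEATURES             Location/Qualifiers
//
"""


def transcript_variant_notes(lines):
    """Yield (locus_name, note) for records with a 'Transcript Variant:' comment."""
    locus = None
    comment = []
    in_comment = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("LOCUS"):
            # Second whitespace-separated field on the LOCUS line is the name.
            locus = line.split()[1]
            comment, in_comment = [], False
        elif line.startswith("COMMENT"):
            in_comment = True
            comment.append(line[len("COMMENT"):].strip())
        elif line.startswith("//"):
            # End of record: join the wrapped comment lines and search them.
            text = " ".join(part for part in comment if part)
            match = re.search(r"Transcript Variant:\s*(.*)", text)
            if match and locus:
                yield locus, match.group(1).strip()
            locus, comment, in_comment = None, [], False
        elif in_comment and line.startswith(" "):
            # Indented continuation line of the COMMENT block.
            comment.append(line.strip())
        else:
            # Any other keyword line closes the COMMENT block.
            in_comment = False


def scan_gpff(path):
    """Stream a gzip-compressed gpff file (path is supplied by the caller)."""
    with gzip.open(path, "rt") as handle:
        yield from transcript_variant_notes(handle)
```

Calling list(transcript_variant_notes(SAMPLE.splitlines())) gives one 
(name, note) pair for the record above. Note that the sketch keeps 
everything after "Transcript Variant:" up to the end of the COMMENT 
block, so any comment text that follows the variant note in a real 
record would be included as well.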


-Frank

Dan Bolser wrote:

>When it comes to redundancy it's good to be specific about what you mean:
>biological redundancy or sequence redundancy? I know RefSeq specifically tries to
>remove biological redundancy (collecting together duplicate copies of the same
>genes). Sequence redundancy is removed in UniProt in the UniParc sequence database.
>
>Also, SwissProt keeps splice variant information implicit in comment lines, which
>you can parse to get the variant sequences (this is what they do in UniParc).
>
>How does Refseq deal with splice variants / sequence redundancy?
>
>Has anyone done large scale comparison of the two databases?
>
>Cheers,
>Dan.
>
>
>++ Stefanie Lager--
>  
>
>>There is redundancy in both UniProt and RefSeq. RefSeq contains splice
>>variants but no fragments, and it has good coverage. SwissProt contains
>>neither fragments nor splice variants, but its coverage isn't quite as
>>good. TrEMBL does contain fragments and splice variants. For a few
>>species the IPI database could be an alternative
>>http://www.ebi.ac.uk/IPI/IPIhelp.html  . It's a merger of UniProt,
>>RefSeq and Ensembl. IPI has good coverage, and it contains splice
>>variants, but few fragments.
>>
>>Stefanie
>>
>>    
>>
>>>Hi, all,
>>>
>>>Recently I have been working on protein sequence analysis.  I found that
>>>RefSeq is very different from UniProt.  UniProt contains many more
>>>proteins if TrEMBL is included.  Can any experts comment on these two
>>>protein databases?  Which one is more reliable?   It seems SwissProt is
>>>of the best quality, then RefSeq, then TrEMBL.   Also, there seems to be
>>>redundancy in UniProt.  Is that so?
>>>
>>>-Frank
>>>
>>>
>>>_______________________________________________
>>>Bioinformatics.Org general forum  -
>>>BiO_Bulletin_Board at bioinformatics.org
>>>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>>>      
>>>



