[BiO BB] Comparing sequences from GenBank and RefSeq...
Dan Bolser
dan.bolser at gmail.com
Tue Apr 28 08:50:32 EDT 2009
2009/4/23 Ryan Raaum <ryan.raaum at gmail.com>:
> The RefSeq entry tells you which non-RefSeq entry/entries it was
> derived from. In this case it says DQ386163, which suggests there are
> at least 2 potato chloroplast sequences available - one by an Italian
> group and one by a Korean group.
Right, I see. Is there any way to judge the quality of the two?
In the RefSeq record I read "PROVISIONAL REFSEQ: This record has not
yet been subject to final NCBI review." - Any way to kick them about
that?
I.e. "Dear RefSeq, I have DQ231562 and DQ386163, should they be merged
into NC_008096?"
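
To make the point about the RefSeq comment concrete, here is a minimal
sketch of pulling the review status and the source accession out of a
RefSeq COMMENT block. The first sentence of the sample text is the one
quoted above; the "derived from" sentence is an assumed wording, and the
function names are made up for illustration.

```python
import re

# Illustrative COMMENT text. The first sentence is quoted from the RefSeq
# record; the "derived from" sentence is an ASSUMED wording.
comment = (
    "PROVISIONAL REFSEQ: This record has not yet been subject to final "
    "NCBI review. The reference sequence was derived from DQ386163."
)

def refseq_status(text):
    """Return the keyword that precedes 'REFSEQ:', or None if absent."""
    match = re.search(r"\b([A-Z]+) REFSEQ:", text)
    return match.group(1) if match else None

def source_accessions(text):
    """Return accession-like tokens that follow the words 'derived from'."""
    _, found, tail = text.partition("derived from")
    if not found:
        return []
    # One or two capital letters plus 5-6 digits, with an optional version.
    return re.findall(r"\b[A-Z]{1,2}\d{5,6}(?:\.\d+)?", tail)

print(refseq_status(comment))
print(source_accessions(comment))
```

The accession pattern is deliberately loose, so it is only a starting
point for records whose comment follows this layout.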
Thanks for the info,
Dan.
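
For anyone who wants to check the numbers in the dnadiff report quoted
below, this is a rough sketch of reading the count rows and comparing them
with the reported AvgIdentity. The three rows are copied from that report;
parse_counts is a hypothetical helper, and treating every SNP and indel as
one mismatched position is my own simplification, not necessarily how
dnadiff computes identity.

```python
import re

# Three rows copied from the dnadiff report quoted below.
report = """\
>> TotalBases 155312 155298
>> TotalSNPs 260 260
>> TotalIndels 30 30
"""

def parse_counts(text):
    """Map each 'Name ref qry' row to a (ref, qry) pair of integers."""
    counts = {}
    for line in text.splitlines():
        # Drop any e-mail quote prefix such as '>> '.
        fields = line.lstrip("> ").split()
        if len(fields) != 3:
            continue
        name, ref, qry = fields
        # Accept plain counts or counts with a percentage, e.g. '23(8.85%)';
        # rows holding decimals (AvgLength, AvgIdentity) are skipped.
        pattern = r"(\d+)(?:\([\d.]+%\))?"
        m_ref = re.fullmatch(pattern, ref)
        m_qry = re.fullmatch(pattern, qry)
        if m_ref and m_qry:
            # Keep the first occurrence, since row names such as 'AC' repeat.
            counts.setdefault(name, (int(m_ref.group(1)), int(m_qry.group(1))))
    return counts

counts = parse_counts(report)
differences = counts["TotalSNPs"][0] + counts["TotalIndels"][0]
identity = 100 * (1 - differences / counts["TotalBases"][0])
print(f"{differences} differences, roughly {identity:.2f}% identity")
```

With these rows the estimate is in line with the AvgIdentity value in the
report, so the SNP and indel counts account for essentially all of the
difference between the two records.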
> On Thu, Apr 23, 2009 at 11:42 AM, Dan Bolser <dan.bolser at gmail.com> wrote:
>> Hi,
>>
>> I found that the potato chloroplast sequence from GenBank (DQ231562.1)
>> has several differences (260 SNPs and 30 indels) relative to the same
>> sequence in RefSeq (NC_008096.1). As far as I am aware this sequence
>> has only been obtained once, so why would the two differ? In general,
>> should I trust the RefSeq sequence?
>>
>>
>> For your reference here is the output of dnadiff over the two files:
>>
>> Reference/DQ231562.fasta Query/NC_008096.fasta
>> NUCMER
>>
>>                 [REF]             [QRY]
>> [Sequences]
>> TotalSeqs       1                 1
>> AlignedSeqs     1(100.00%)        1(100.00%)
>> UnalignedSeqs   0(0.00%)          0(0.00%)
>>
>> [Bases]
>> TotalBases      155312            155298
>> AlignedBases    155312(100.00%)   155298(100.00%)
>> UnalignedBases  0(0.00%)          0(0.00%)
>>
>> [Alignments]
>> 1-to-1          1                 1
>> TotalLength     155312            155298
>> AvgLength       155312.00         155298.00
>> AvgIdentity     99.81             99.81
>>
>> M-to-M          1                 1
>> TotalLength     155312            155298
>> AvgLength       155312.00         155298.00
>> AvgIdentity     99.81             99.81
>>
>> [Feature Estimates]
>> Breakpoints     0                 0
>> Relocations     0                 0
>> Translocations  0                 0
>> Inversions      0                 0
>>
>> Insertions      0                 0
>> InsertionSum    0                 0
>> InsertionAvg    0.00              0.00
>>
>> TandemIns       0                 0
>> TandemInsSum    0                 0
>> TandemInsAvg    0.00              0.00
>>
>> [SNPs]
>> TotalSNPs       260               260
>> AC              23(8.85%)         14(5.38%)
>> AG              24(9.23%)         30(11.54%)
>> AT              15(5.77%)         14(5.38%)
>> CA              14(5.38%)         23(8.85%)
>> CG              24(9.23%)         18(6.92%)
>> CT              32(12.31%)        19(7.31%)
>> GA              30(11.54%)        24(9.23%)
>> GC              18(6.92%)         24(9.23%)
>> GT              13(5.00%)         34(13.08%)
>> TA              14(5.38%)         15(5.77%)
>> TC              19(7.31%)         32(12.31%)
>> TG              34(13.08%)        13(5.00%)
>>
>> TotalGSNPs      113               113
>> AC              9(7.96%)          8(7.08%)
>> AG              17(15.04%)        17(15.04%)
>> AT              5(4.42%)          3(2.65%)
>> CA              8(7.08%)          9(7.96%)
>> CG              6(5.31%)          7(6.19%)
>> CT              15(13.27%)        8(7.08%)
>> GA              17(15.04%)        17(15.04%)
>> GC              7(6.19%)          6(5.31%)
>> GT              6(5.31%)          12(10.62%)
>> TA              3(2.65%)          5(4.42%)
>> TC              8(7.08%)          15(13.27%)
>> TG              12(10.62%)        6(5.31%)
>>
>> TotalIndels     30                30
>> A.              14(46.67%)        4(13.33%)
>> C.              1(3.33%)          0(0.00%)
>> G.              0(0.00%)          0(0.00%)
>> T.              7(23.33%)         4(13.33%)
>>
>> TotalGIndels    24                24
>> A.              10(41.67%)        4(16.67%)
>> C.              1(4.17%)          0(0.00%)
>> G.              0(0.00%)          0(0.00%)
>> T.              5(20.83%)         4(16.67%)
>>
>>
>> Thanks for any pointers,
>> Dan.
>>
>> _______________________________________________
>> BBB mailing list
>> BBB at bioinformatics.org
>> http://www.bioinformatics.org/mailman/listinfo/bbb
>>
>
>
>
> --
> Ryan Raaum
> Assistant Professor
> Department of Anthropology
> Lehman College, The City University of New York
> 250 Bedford Park Blvd. West
> Bronx, NY 10468
> e: ryan.raaum at lehman.cuny.edu
> w: http://www.raaum.org
> o: (718) 960-8845
> f: (718) 960-8406