[BiO BB] Getting PDB id from Swissprot entry

Thu Jul 1 12:15:21 EDT 2004

Hi,

I do not know for which particular purpose you are looking for SP <--> 
PDB mappings, but beware the following perils & pitfalls:

1) The SP <--> PDB mapping can be many-to-many.

1.1) There may be several entries in PDB which correspond to a single 
Swiss-Prot entry. This is because the same protein may have been 
structurally solved by different groups at different times, solved with 
different ligands, point mutated, and so forth. Be very careful about 
point mutants: their sequence in PDB differs from the one in SP.

1.2) At the same time, there may be several SP entries corresponding to 
a single PDB entry. This may be due to SP redundancy (although database 
curators are doing a fantastic job of keeping that down), close 
homologs, or point mutations.
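
To make the many-to-many point concrete, here is a minimal Python sketch 
that collects the PDB cross-references from the DR lines of Swiss-Prot 
flat-file records and indexes them in both directions. The function name 
and the two sample records (accessions X00001/X00002, PDB ids 1AAA/1BBB) 
are made up for illustration, and the DR layout assumed here is simply 
"DR   PDB; <id>; ..."; check it against the release you actually use.

```python
from collections import defaultdict

def parse_sp_pdb_links(flatfile_text):
    """Build SP -> PDB and PDB -> SP indexes from AC and 'DR   PDB;' lines."""
    sp_to_pdb = defaultdict(set)
    pdb_to_sp = defaultdict(set)
    accession = None
    for line in flatfile_text.splitlines():
        code = line[:2]
        if code == "AC" and accession is None:
            # Take the first accession on the first AC line as the primary one.
            accession = line[2:].split(";")[0].strip()
        elif code == "DR" and accession is not None:
            fields = [f.strip() for f in line[2:].rstrip(".").split(";")]
            if len(fields) >= 2 and fields[0] == "PDB":
                pdb_id = fields[1].upper()
                sp_to_pdb[accession].add(pdb_id)
                pdb_to_sp[pdb_id].add(accession)
        elif code == "//":
            # End of record: forget the current accession.
            accession = None
    return sp_to_pdb, pdb_to_sp

# Hypothetical records, NOT real Swiss-Prot entries.
SAMPLE = """\
ID   FAKE1_HUMAN
AC   X00001;
DR   PDB; 1AAA; X-ray.
DR   PDB; 1BBB; NMR.
//
ID   FAKE2_MOUSE
AC   X00002; X00009;
DR   PDB; 1AAA; X-ray.
DR   EMBL; Z00000; -.
//
"""

sp_to_pdb, pdb_to_sp = parse_sp_pdb_links(SAMPLE)
for acc in sorted(sp_to_pdb):
    print(acc, "->", sorted(sp_to_pdb[acc]))
for pdb_id in sorted(pdb_to_sp):
    print(pdb_id, "->", sorted(pdb_to_sp[pdb_id]))
```

Keeping both indexes means neither direction is silently collapsed to a 
single "best" hit, which is exactly where the one-to-one assumption 
tends to bite.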

2) An SP amino-acid sequence is rarely the same as the PDB sequence. 
Usually only part of a structure is solved. Gaps abound, because 
crystallographers sometimes cannot see the loops. There are large 
deletions, because some parts are not crystallizable or, in the case of 
NMR, because the experimenters are trying to keep the protein short.
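
A quick way to see how much of an SP sequence a PDB chain really covers 
is to look for the stretches the two sequences share. The sketch below 
uses Python's standard difflib.SequenceMatcher for that; it is a crude 
stand-in for a proper pairwise alignment, and the function name, the 
min_block threshold and both toy sequences are assumptions made up for 
this example.

```python
from difflib import SequenceMatcher

def sp_coverage(sp_seq, pdb_seq, min_block=5):
    """Return (fraction of sp_seq covered, list of covered [start, end) spans).

    Only exact shared stretches of at least min_block residues count, so a
    point mutation splits a stretch in two and unresolved loops show up as gaps.
    """
    matcher = SequenceMatcher(None, sp_seq, pdb_seq, autojunk=False)
    blocks = [b for b in matcher.get_matching_blocks() if b.size >= min_block]
    covered = sum(b.size for b in blocks)
    segments = [(b.a, b.a + b.size) for b in blocks]
    return covered / len(sp_seq), segments

# Toy data: the "PDB" chain lacks both termini and an internal loop.
sp_seq = "ACDEFGHIKLMNPQRSTVWY"
pdb_seq = "CDEFGH" + "NPQRST"

coverage, segments = sp_coverage(sp_seq, pdb_seq)
print(f"coverage: {coverage:.2f}")
print("covered SP segments (0-based, end-exclusive):", segments)
```

For real data you would take the PDB sequence from the coordinates 
(the residues actually observed) rather than from the declared 
construct, since the two can differ.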

3) There are many SP entries which do have an equivalent in PDB, but 
neither the DR (cross-reference) lines of the SP entry nor the PDB entry 
says so. See also the "40%" comment below.

Cheers,

Iddo

Sourangshu Bhattacharya wrote:
> Hi Dan,
> Thank you very much. I didn't know about MSD.
> 
> There is also an entry HSSP in swissprot which gives homologues.
> 
> Sourangshu
> 
> Dan Bolser wrote:
> 
>> On Wed, 30 Jun 2004, Sourangshu Bhattacharya wrote:
>>
>>  
>>
>>> Hi,
>>>
>>> Is there a direct way (without reading the protein name from 
>>> swissprot and searching in PDB) of getting the PDB id of the protein 
>>> corresponding to a particular Swissprot id ?
>>>   
>>
>>
>> I would use the MSD database, which maintains a manually curated version
>> of the SwissProt to PDB mapping.
>>
>>  
>>
>>> Also, how do I know whether structure for a particular protein 
>>> corresponding to a swissprot id has been determined or not ?
>>>   
>>
>>
>> Strictly speeking, the above mapping gives you this. More realistically,
>> however, you can consider very close homologues to the above set as also
>> 'solved'. Where you draw the line is a matter of requirement, but you can
>> get reasonable models (allegedly) at > 40% sequence identity, or
>> reasonable 'fold prediction' at much larger distances (see SUPERFAMILY
>> for example).
>>
>> It all depends on what you want to do.
>>
>>  
>>
>>> Thank you very much..
>>>
>>> Regards,
>>> Sourangshu.
>>>
>>>
>>>   
>>
>>
> 

-- 
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9930
http://ffas.ljcrf.edu/~iddo