[BiO BB] atomic coordinates of the proteins to predict protein tertiary structure using adaptive neuro-fuzzy systems

Mon Dec 25 05:41:18 EST 2006

Sub: atomic coordinates of the proteins to predict protein tertiary
structure using adaptive neuro-fuzzy systems

We are trying to predict protein tertiary structure using adaptive
neuro-fuzzy systems. This method falls under the category of homology
modeling, since we use homologous proteins for the prediction.

We are facing a few problems in deriving the training data.

As an example, consider a situation where I use 2 proteins (PDB IDs: 1MYJ
and 1YMB) to predict the structure of the protein 'Human Myoglobin' (PDB ID:
2MM1).

I am including the co-ordinates of the first few alpha carbons below.

For 1MYJ, the first few alpha carbons have atomic co-ordinates as follows:

Residue         X          Y          Z
GLY        48.763     -12.39      28.42
LEU        52.541     -11.96      28.764
SER        54.79      -13.074     25.979
ASP        56.15      -16.567     25.988
GLY        59.289     -14.893     27.105
GLU        57.708     -13.153     30.06
TRP        56.174     -16.476     31.235
GLN        59.566     -17.977     31.117

For 1YMB, the co-ordinates of the first few alpha carbon atoms are as
follows:

Residue         X          Y          Z
GLY        -3.12      15.454     14.959
LEU        -0.482     14.622     17.641
SER        -1.126     13.415     21.177
ASP         0.851     14.53      24.209
GLY         2.401     11.015     24.161
GLU         3.737     11.867     20.724
TRP         4.724     15.486     21.478
GLN         6.75      14.065     24.424

For 2MM1, the co-ordinates of the first few alpha carbon atoms are as
follows:

Residue         X          Y          Z
GLY        -4.704     17.705     14.942
LEU        -1.187     18.608     16.091
SER        -0.042     17.772     19.588
ASP         0.651     20.497     22.117
GLY         4.297     19.54      21.663
GLU         3.879     19.913     17.893
TRP         2.244     23.408     18.097
GLN         5.159     24.505     20.293

You can see that even though the amino acids are the same, their atomic
co-ordinates are completely different. This is not because of the amino
acids that follow in the sequence, since all 3 proteins have very similar
sequences.

I infer this is because the atomic co-ordinates are not absolute: each PDB
entry appears to be given in its own frame of reference, so the raw values
cannot be taken as such for training. We faced this issue while using this
data for training neural networks, since the networks won't know *which
co-ordinates* to predict.
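
As a quick check of this inference, here is a minimal sketch (our own illustration, not part of any tool) that compares the distances between consecutive alpha carbons in 1MYJ and 2MM1, using only the first four residues listed above. The helper name `consecutive_distances` is hypothetical. If the differences are only due to the frame of reference, these internal distances should agree even though the raw co-ordinates do not.

```python
import math

# First four alpha-carbon co-ordinates (GLY, LEU, SER, ASP) copied from this post.
CA_1MYJ = [
    (48.763, -12.39, 28.42),
    (52.541, -11.96, 28.764),
    (54.79, -13.074, 25.979),
    (56.15, -16.567, 25.988),
]
CA_2MM1 = [
    (-4.704, 17.705, 14.942),
    (-1.187, 18.608, 16.091),
    (-0.042, 17.772, 19.588),
    (0.651, 20.497, 22.117),
]


def consecutive_distances(coords):
    """Distance between each alpha carbon and the next one in the chain."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


if __name__ == "__main__":
    # Internal geometry: does not depend on where the molecule sits in space.
    print("1MYJ CA-CA:", [round(d, 2) for d in consecutive_distances(CA_1MYJ)])
    print("2MM1 CA-CA:", [round(d, 2) for d in consecutive_distances(CA_2MM1)])
    # Raw offset between the two first residues: depends on the frame only.
    print("GLY1 offset:", round(math.dist(CA_1MYJ[0], CA_2MM1[0]), 2))
```

The consecutive distances come out close to 3.8 Angstrom in both structures, while the first residues sit tens of Angstrom apart, which supports the idea that only the placement in space differs.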

Can you suggest any means by which we can get the atomic coordinates of the
proteins used for training, after alignment?

That is, all the proteins used for training should be aligned first, and
then we need the co-ordinates in that common frame so that we can use them
for training.
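
To make the request concrete, the kind of output we are after can be sketched as a least-squares rigid superposition (the Kabsch algorithm) that returns the transformed co-ordinates, not just a score. This is only a sketch under stated assumptions: it uses NumPy, the function name `kabsch_superpose` is hypothetical, it uses only the first eight alpha carbons listed in this post, and it assumes the residues are already paired one-to-one in order (no gaps), which a real pipeline would have to get from a sequence or structure alignment.

```python
import numpy as np

# First eight alpha-carbon co-ordinates copied from this post.
CA_1MYJ = np.array([
    [48.763, -12.39, 28.42], [52.541, -11.96, 28.764],
    [54.79, -13.074, 25.979], [56.15, -16.567, 25.988],
    [59.289, -14.893, 27.105], [57.708, -13.153, 30.06],
    [56.174, -16.476, 31.235], [59.566, -17.977, 31.117],
])
CA_2MM1 = np.array([
    [-4.704, 17.705, 14.942], [-1.187, 18.608, 16.091],
    [-0.042, 17.772, 19.588], [0.651, 20.497, 22.117],
    [4.297, 19.54, 21.663], [3.879, 19.913, 17.893],
    [2.244, 23.408, 18.097], [5.159, 24.505, 20.293],
])


def kabsch_superpose(mobile, target):
    """Rigidly move `mobile` onto `target` (paired N x 3 arrays).

    Returns the transformed copy of `mobile` and the RMSD after fitting.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    p = mobile - mobile_centroid          # centred mobile points
    q = target - target_centroid          # centred target points
    u, _, vt = np.linalg.svd(p.T @ q)     # SVD of the 3 x 3 covariance matrix
    # Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    aligned = p @ rotation.T + target_centroid
    rmsd = np.sqrt(((aligned - target) ** 2).sum(axis=1).mean())
    return aligned, rmsd


if __name__ == "__main__":
    aligned, rmsd = kabsch_superpose(CA_1MYJ, CA_2MM1)
    print("1MYJ alpha carbons in the 2MM1 frame:")
    print(np.round(aligned, 3))
    print("RMSD after superposition:", round(float(rmsd), 3))
```

With every training protein moved into the frame of one reference structure in this way, the transformed co-ordinates would be directly comparable across proteins.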

Our understanding is that there are many tools for aligning the alpha carbon
atoms of proteins, but they do not output the atomic co-ordinates after
alignment.

Thanks and regards,

Srijith V. M.