[ssml] Re: Sequence Identity Matrix

Mensur Dlakic mdlakic at montana.edu
Thu Jul 21 11:45:32 EDT 2005


I am not sure I have the right answer for you, but it's worth a try. There 
is a program called al2co (see the links below)
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Abstract&list_uids=11524371
ftp://iole.swmed.edu/pub/al2co/

that will take an alignment in ClustalW format, like this:

Q8XJP5_1:75        MKELVEIIAKSLVDKPEDVHVNEVLGEESIILELKVSPEDMGKVIGKQGR
Q8R9X2_3:75        --ELVKTIAKALVDNPDAVEVNEIHGHQSIIIELKVAPEDMGKVIGKQGR
Q97I96_1:75        MKQLLETIAKSLVDCPDEVQVSEVTGEQSIILELKVAPEDMGKVIGKQGR
UPI000039B24B_1:75 MKELLITLAKALVDHPDQVSVNQIEGEKSVILELRVAQEDMGKVIGKQGR
Q6MEB8_1:75        MKEFVAYIVKNLVDHPDKVKINEIGGTQTLIIELSVEKSDIGKIIGKKGK
Q9Z7I5_1:75        MKEFLAYIIKNLVDRPEEVRIKEVQGTHTIIYELSVAKPDIGKIIGKEGR

Q8XJP5_1:75        IAKAIRTVVKAAAIKENKKVVVEII
Q8R9X2_3:75        IAQAIRTLVKAAALKEKKRVIVEII
Q97I96_1:75        IAKAIRTVIKAAAVKENKRVVVEII
UPI000039B24B_1:75 IARAIRTLVKAAAAHEGKRVVVEII
Q6MEB8_1:75        TINAIRTLLMSVASRNGIRVNLEIL
Q9Z7I5_1:75        TIKAIRTLLVSVASRNNVRVSLEIM

and output the conservation at each alignment position (on the line
labeled CSV, on a 0-9 scale), like this:

CSV:                  996662364939991961906466093264939939434969969994966629999634
Q8XJP5_1:75           MKELVEIIAKSLVDKPEDVHVNEVLGEESIILELKVSPEDMGKVIGKQGRIAKAIRTVVK
Q8R9X2_3:75           --ELVKTIAKALVDNPDAVEVNEIHGHQSIIIELKVAPEDMGKVIGKQGRIAQAIRTLVK
Q97I96_1:75           MKQLLETIAKSLVDCPDEVQVSEVTGEQSIILELKVAPEDMGKVIGKQGRIAKAIRTVIK
UPI000039B24B_1:75    MKELLITLAKALVDHPDQVSVNQIEGEKSVILELRVAQEDMGKVIGKQGRIARAIRTLVK
Q6MEB8_1:75           MKEFVAYIVKNLVDHPDKVKINEIGGTQTLIIELSVEKSDIGKIIGKKGKTINAIRTLLM
Q9Z7I5_1:75           MKEFLAYIIKNLVDRPEEVRIKEVQGTHTIIYELSVAKPDIGKIIGKEGRTIKAIRTLLV

CSV:                  669136346926994
Q8XJP5_1:75           AAAIKENKKVVVEII
Q8R9X2_3:75           AAALKEKKRVIVEII
Q97I96_1:75           AAAVKENKRVVVEII
UPI000039B24B_1:75    AAAAHEGKRVVVEII
Q6MEB8_1:75           SVASRNGIRVNLEIL
Q9Z7I5_1:75           SVASRNNVRVSLEIM

or even in numeric form (only first 20 residues shown):

1     M       1.130
2     K       1.130
3     E       0.001
4     L      -0.120
5     V      -0.131
6     E      -1.379
7     I      -0.846
8     I      -0.012
9     A      -0.818
10    K       1.130
11    S      -0.867
12    L       1.130
13    V       1.130
14    D       1.130
15    K      -1.765
16    P       1.130
17    E      -0.116
18    D      -1.769
19    V       1.130
20    H      -2.128
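
Note that the output above is per-position conservation, not a pairwise
matrix. If you need the identity matrix itself, it can be computed directly
from the same alignment. Below is a minimal sketch in Python and not part of
al2co or ClustalW. The function names (parse_clustal_blocks,
percent_identity, identity_matrix) are made up for illustration. It assumes
each alignment line is "name residues" with no trailing residue counts, and
it defines identity as matches divided by the number of columns where neither
sequence has a gap. Other definitions use a different denominator and will
give different numbers.

```python
from itertools import combinations


def parse_clustal_blocks(text):
    """Join interleaved alignment blocks into one gapped string per name."""
    seqs = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only "name residues" lines; this skips blank lines, the
        # CLUSTAL header, and the indented consensus (*:.) lines.
        if len(parts) != 2 or line[0].isspace():
            continue
        name, chunk = parts
        seqs[name] = seqs.get(name, "") + chunk
    return seqs


def percent_identity(a, b):
    """Percent identity over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    same = sum(x == y for x, y in pairs)
    return 100.0 * same / len(pairs)


def identity_matrix(seqs):
    """Return (names, matrix), where matrix[a][b] is the percent identity."""
    names = list(seqs)
    matrix = {n: {n: 100.0} for n in names}
    for a, b in combinations(names, 2):
        matrix[a][b] = matrix[b][a] = percent_identity(seqs[a], seqs[b])
    return names, matrix


# Two sequences taken from the second block of the alignment above.
example = """
Q8XJP5_1:75        IAKAIRTVVKAAAIKENKKVVVEII
Q8R9X2_3:75        IAQAIRTLVKAAALKEKKRVIVEII
"""

names, matrix = identity_matrix(parse_clustal_blocks(example))
for n in names:
    print(n.ljust(20), " ".join(f"{matrix[n][m]:6.1f}" for m in names))
```

A dissimilarity matrix follows as 100 minus each entry. A similarity matrix
would additionally need a substitution matrix or residue grouping, which this
sketch does not attempt.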

Hope this helps.

Mensur


At 02:11 AM 7/21/2005, you wrote:
>Is there any multiple alignment program which outputs a sequence
>identity/similarity/dissimilarity matrix along with other output files?
>
>This should be possible by ClustalW..any pointers about the options to be
>used?
>
>Thanks
>
>Rajesh

==========================================================================
| Mensur Dlakic, PhD                | Tel: (406) 994-6576                |
| Department of Microbiology        | Fax: (406) 994-4926                |
| Montana State University          |                                    |
| 109 Lewis Hall, P.O. Box 173520   | http://myprofile.cos.com/mensur    |
| Bozeman, MT 59717-3520            | E-mail: mdlakic at montana.edu        |
==========================================================================


