[BiO BB] ortholog
    Mike Marchywka 
    marchywka at hotmail.com
       
    Thu Sep 13 17:37:48 EDT 2007
    
    
  
If you are interested in doing custom comparisons or looking for
differences by functional area, it turns out that the PROSITE rules are
downloadable. They sent me a link today:
ftp://ftp.expasy.org/databases/prosite/prosite.dat
I had to ignore their matrix data, but their pattern library was easily
converted into Perl-style regular expressions (as far as I have looked;
the obvious caveats about bugs apply, and the canned C++ regex code I
lifted from Microsoft may not be bug free either). It gave me quick
graphical and textual compare results on 1000+ rules.
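As an illustration of the pattern conversion described above, here is a
minimal sketch in Python rather than the Perl/C++ combination mentioned in
the post. It assumes the usual PROSITE pattern conventions (elements
separated by "-", "x" for any residue, "[..]" for allowed residues, "{..}"
for excluded residues, "(n)" or "(n,m)" for repeats, "<" and ">" as
terminus anchors). The function names prosite_to_regex and find_hits are
hypothetical, and rarer forms such as ">" inside square brackets are not
handled. The example pattern is quoted from memory and should be checked
against prosite.dat.

```python
import re

def prosite_to_regex(pattern):
    """Translate one PROSITE pattern string into a Python regex string."""
    regex = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # '<' anchors the match at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # '>' anchors the match at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue part from an optional repeat such as (2) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.groups()
        if core == "x":               # x = any residue
            core = "."
        elif core.startswith("{"):    # {ABC} = any residue except A, B or C
            core = "[^" + core[1:-1] + "]"
        # [ABC] and single letters are already valid regex as written.
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        regex.append(prefix + core + repeat + suffix)
    return "".join(regex)

def find_hits(pattern, sequence):
    """Return (offset, matched text) for every hit, including overlapping ones."""
    # A lookahead with a capture group lets the scan restart at every offset.
    rx = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in rx.finditer(sequence)]

if __name__ == "__main__":
    # Pattern as usually given for ASN_GLYCOSYLATION; the peptide is made up.
    print(prosite_to_regex("N-{P}-[ST]-{P}."))
    print(find_hits("N-{P}-[ST]-{P}.", "MNGTANASA"))
```

The lookahead is a design choice: short motifs can overlap, and a plain
finditer would skip any hit that starts inside the previous one.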
The point here is that you can make your own rules as you read the
literature (that is my plan anyway) and implement ad hoc splicing or
translation schemes (pretend you want to model flaky ribosomes).
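To make the idea of an ad hoc translation scheme concrete, here is a small
Python sketch. It is not the code behind rules_annotater. It builds the
standard genetic code, lets the caller override individual codons (for
example, reading a stop codon through as an amino acid, which is a purely
hypothetical stand-in for a "flaky ribosome"), and scans the three forward
reading frames with a hand-written regex rule. The names translate,
scan_frames and overrides are assumptions made for this example.

```python
import re

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {a + b + c: AMINO[16 * i + 4 * j + k]
            for i, a in enumerate(BASES)
            for j, b in enumerate(BASES)
            for k, c in enumerate(BASES)}

def translate(dna, frame=0, overrides=None):
    """Translate one forward frame; 'overrides' replaces chosen codons."""
    table = dict(STANDARD)
    table.update(overrides or {})
    dna = dna.upper()
    # Codons with ambiguous bases become 'X'; a trailing partial codon is dropped.
    return "".join(table.get(dna[i:i + 3], "X")
                   for i in range(frame, len(dna) - 2, 3))

def scan_frames(dna, rule, overrides=None):
    """Return (frame, peptide offset, matched text) for each rule hit."""
    hits = []
    for frame in range(3):
        protein = translate(dna, frame, overrides)
        for m in re.finditer(rule, protein):
            hits.append((frame, m.start(), m.group(0)))
    return hits

if __name__ == "__main__":
    dna = "ATGAATGGTACTGCTTAA"                  # made-up test sequence
    print(translate(dna))                       # standard code
    print(translate(dna, overrides={"TAA": "Q"}))  # hypothetical read-through
    print(scan_frames(dna, "N[^P][ST][^P]"))    # hand-written rule
```

Reverse-strand frames and splicing are left out to keep the sketch short;
both could be added by transforming the DNA string before calling
translate.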
Anyway, I get stuff like this. Matching the translated rules against the
sequences generates rule hit files:
$ $progpath/rules_annotater -clean -which 1 -fastas o2_fasta -xrules $progpath/prosite_rules > pro1
$ $progpath/mm_align_tool -fastas o2_fasta -rules pro0 -rules pro1 -stats
For Rules set 0:>ref|NW_876253.1|Cfa11_WGA39_2:47189155-47195387 Canis familiaris chromosome 11 genomic contig, whole genome shotgun sequence
97         >rule|13|PEPDTIDE Prosite MICROBODIES_CTER
68         >rule|3|PEPDTIDE Prosite PKC_PHOSPHO_SITE
64         >rule|6|PEPDTIDE Prosite MYRISTYL
47         >rule|4|PEPDTIDE Prosite CK2_PHOSPHO_SITE
46         >rule|11|PEPDTIDE Prosite PRENYLATION
30         >rule|1|PEPDTIDE Prosite ASN_GLYCOSYLATION
10         >rule|2|PEPDTIDE Prosite CAMP_PHOSPHO_SITE
10         >rule|5|PEPDTIDE Prosite TYR_PHOSPHO_SITE
6          >rule|7|PEPDTIDE Prosite AMIDATION
3          >rule|87|PEPDTIDE Prosite LEUCINE_ZIPPER
2          >rule|12|PEPDTIDE Prosite ER_TARGET
1          >rule|1087|PEPDTIDE Prosite THIONIN
1          >rule|973|PEPDTIDE Prosite TUBULIN_B_AUTOREG
For Rules set 1:>gb|AACN010493556.1|:1-1146 Canis familiaris ctg19866850213054, whole genome shotgun sequence
23         >rule|13|PEPDTIDE Prosite MICROBODIES_CTER
9          >rule|1|PEPDTIDE Prosite ASN_GLYCOSYLATION
8          >rule|11|PEPDTIDE Prosite PRENYLATION
8          >rule|3|PEPDTIDE Prosite PKC_PHOSPHO_SITE
8          >rule|6|PEPDTIDE Prosite MYRISTYL
7          >rule|4|PEPDTIDE Prosite CK2_PHOSPHO_SITE
2          >rule|5|PEPDTIDE Prosite TYR_PHOSPHO_SITE
1          >rule|12|PEPDTIDE Prosite ER_TARGET
1          >rule|7|PEPDTIDE Prosite AMIDATION
1          >rule|87|PEPDTIDE Prosite LEUCINE_ZIPPER
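For a quick textual comparison of two such hit lists, the per-rule counts
can be lined up side by side. The sketch below assumes only the line
layout visible in the -stats output above (a count, then
">rule|ID|TYPE Prosite NAME"); parse_hits and compare are hypothetical
helpers, not part of mm_align_tool.

```python
import re

# count, then ">rule|<id>|<type> <source> <name>", as in the -stats listing
HIT_LINE = re.compile(r"^\s*(\d+)\s+>rule\|\d+\|\S+\s+\S+\s+(\S+)\s*$")

def parse_hits(text):
    """Map rule name -> hit count for every line that looks like a hit line."""
    hits = {}
    for line in text.splitlines():
        m = HIT_LINE.match(line)
        if m:
            hits[m.group(2)] = int(m.group(1))
    return hits

def compare(set0, set1):
    """Return (rule name, count in set 0, count in set 1), largest totals first."""
    names = sorted(set(set0) | set(set1),
                   key=lambda n: (-(set0.get(n, 0) + set1.get(n, 0)), n))
    return [(n, set0.get(n, 0), set1.get(n, 0)) for n in names]

if __name__ == "__main__":
    # Three lines copied from the listing above, one set each.
    a = parse_hits("97         >rule|13|PEPDTIDE Prosite MICROBODIES_CTER\n"
                   "1          >rule|1087|PEPDTIDE Prosite THIONIN\n")
    b = parse_hits("23         >rule|13|PEPDTIDE Prosite MICROBODIES_CTER\n")
    for name, n0, n1 in compare(a, b):
        print(f"{name:20s} {n0:5d} {n1:5d}")
```

Raw counts are not normalized here, so a longer sequence will naturally
collect more hits than a shorter one.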
This turned out to be easy to align, as the sequences are largely
identical (the lone "G" is the mismatch in this excerpt), but you get
the idea:
$ $progpath/mm_align_tool -fastas o2_fasta -rules pro0 -rules pro1 -use_rule 13 -align -output text
[...]
Start at 696 and 2373:
          GGCCATTTTGCAACTCATGCATGAGCTACCTTTAGTTCCCCTTCTACATCTGAGAACTGT
          CCCATATAGAATATTTTATAAAACAAGATGGCATTGTGCTAAGTAAAATGCAGAACAAAA
                                     G
          TCAGTATCCCATTAGACATGTCATATTCAGAGTTTATTTTTATCCTTGCACTGAAAGAAT
          GATTGTAAATCAATGGTTTCTTTTTGTTTCTTGACTGTGGCAGTGTTCTGGCTCCAAATG
          ATGGAGATTCCAAATAAGCATTACAGCTTGGCAGGAAATGCCAGTTCAGATATTTGTGAG
          ATCCTAAAGAATAGATCTGGACACATAT
    
    