[BiO BB] question on RNA and species signatures

Mike Marchywka marchywka at hotmail.com
Wed Jul 25 15:58:36 EDT 2007


I've been trying to find a comprehensive way to analyze non-coding RNA,
with no luck. I've asked people working in areas such as siRNA and
riboswitches, without much success. Any comments or discussion?

This came up most recently because I found a short sequence with an unusual
species distribution, and I was curious to know whether it has a name.

If I just type in some random junk, I get about what you would expect
(this is my own blast script, with most options being self-explanatory;
"-summ" translates into "-v" to limit summary lines, and -db selects the
wgs database):
  blastnew -out control -nuc -hits 0 -summ 3000 -db wgs -expect 1e8 \
      TCCTGGAGTCCCAGAGTTCAGCTAAACCGATCACATTGTAT

$ more control | sed -n '/producing signif/,/^>/p' | sed -n 's/.*|//p' |
    awk '{print $1" " $2}' | sort | uniq -c | sort -g -r | more

304 Homo sapiens
261 Bos taurus
212 Pan troglodytes
171 Microcebus murinus
155 Equus caballus
136 Spermophilus tridecemlineatus
130 Canis familiaris
125 Otolemur garnettii
112 Ornithorhynchus anatinus
111 Tupaia belangeri
96 Myotis lucifugus
93 Mus musculus
86 Felis catus
75 Rattus norvegicus
74 Oryzias latipes
71 Drosophila erecta
68 Sorex araneus
63 Loxodonta africana
58 Anolis carolinensis
56 Monodelphis domestica
48 Macaca mulatta
47 Gallus gallus
34 Oryctolagus cuniculus
30 Erinaceus europaeus
27 Strongylocentrotus purpuratus
27 Callorhinchus milii
26 Dasypus novemcinctus
22 Echinops telfairi
22 Cavia porcellus
17 Danio rerio
14 Schmidtea mediterranea
13 Ochotona princeps
13 Aplysia californica
10 Anopheles gambiae
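
The sed/awk pipeline above can also be folded into a single awk pass. What
follows is only a minimal sketch, not the script used for the numbers above:
the function name species_tally is hypothetical, and it assumes each summary
line has the form "db|accession| Genus species contig, ...", with the summary
running from the "producing signif" line to the first line starting with ">".

```shell
# species_tally FILE
# Count hits per "Genus species" in the summary section of a BLAST text report.
# Assumption: summary lines look like "db|accession| Genus species contig, ...".
species_tally() {
    awk '
        /producing signif/ { in_summary = 1; next }   # summary header line
        /^>/               { in_summary = 0 }         # first alignment ends it
        in_summary && index($0, "|") {
            sub(/.*[|]/, "")        # drop everything through the last "|"
            count[$1 " " $2]++      # the first two words are genus and species
        }
        END { for (s in count) print count[s], s }
    ' "$1" | sort -k1,1nr -k2       # highest count first, then by name
}
```

Usage would be "species_tally control | more". One difference from the
pipeline above: the ">" line that closes the summary is not counted here,
whereas the sed range includes it.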

This one, on the other hand, has much better matches (note the expect limit):
  blastnew -out dog_sign -nuc -hits 0 -summ 3000 -db wgs -expect .01 \
      TCCTGGAGTCCCAGGATCCAGTCCCACGTCGGGCTCCCT
and it is confined to dogs:
$ more dog_sign | sed -n '/producing signif/,/^>/p' | sed -n 's/.*|//p' |
    awk '{print $1" " $2}' | sort | uniq -c | sort -g -r | more
   3000 Canis familiaris

And these all seem to be in different places (the most frequent location
occurs only once):

$ more dog_sign | sed -n '/producing signif/,/^>/p' | sed -n 's/.*|//p' |
    awk '{print $3}' | sort | uniq -c | sort -g -r | more
      1 ctg19866851899833,
      1 ctg19866851899815,
      1 ctg19866851899794,
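
Rather than paging through thousands of count-of-one lines, the same check
can be reduced to two numbers. This is a rough sketch under the same
assumption about the summary line format ("db|accession| Genus species
contig, ..."); the function name contig_summary and its output labels are
made up for illustration.

```shell
# contig_summary FILE
# Report how many distinct contigs appear in the BLAST summary section and
# the largest number of hits that fall on any single contig.
contig_summary() {
    awk '
        /producing signif/ { in_summary = 1; next }
        /^>/               { in_summary = 0 }
        in_summary && index($0, "|") {
            sub(/.*[|]/, "")        # drop everything through the last "|"
            sub(/,$/, "", $3)       # third word is the contig; trim the comma
            hits[$3]++
        }
        END {
            for (c in hits) { distinct++; if (hits[c] > max) max = hits[c] }
            printf "distinct=%d max_hits=%d\n", distinct, max
        }
    ' "$1"
}
```

If every hit really is on a different contig, "contig_summary dog_sign"
should report max_hits=1. Since -summ capped the summary at 3000 lines,
these numbers describe only the reported hits.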



Would anyone care to comment on the significance of this sequence, or on
why it is just an uninteresting fluke?

Thanks.


Mike Marchywka
586 Saint James Walk
Marietta GA 30067-7165
404-788-1216 (C)<- leave message
989-348-4796 (P)<- emergency only
marchywka at hotmail.com
