PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: NC_013853.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): not specified
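
The E-value parameter above can be read as a cutoff on the RPS-BLAST hits listed further down this page. The following is a minimal sketch of such a filter, not PredictBias's own code: the `Hit` class, the `significant_hits` function, and the "keep when E-value is at or below the threshold" rule are assumptions made for illustration. The first two hits are taken from this report; the third is a made-up entry that shows a rejected hit.

```python
from dataclasses import dataclass

# "E-value (RPSBlast)" from the input parameters of this report.
EVALUE_THRESHOLD = 0.05


@dataclass(frozen=True)
class Hit:
    """One RPS-BLAST hit against a virulence-factor signature (illustrative type)."""
    locus_tag: str
    signature: str
    score_bits: float
    evalue: float


def significant_hits(hits, threshold=EVALUE_THRESHOLD):
    """Keep hits whose E-value is at or below the threshold (assumed semantics)."""
    return [h for h in hits if h.evalue <= threshold]


hits = [
    Hit("smi_2088", "V8PROTEASE", 61.6, 1e-12),    # from this report
    Hit("smi_2087", "56KDTSANTIGN", 27.2, 0.037),  # from this report
    Hit("smi_0000", "EXAMPLE", 20.0, 0.2),         # hypothetical, above the cutoff
]

kept = significant_hits(hits)
print([h.locus_tag for h in kept])  # ['smi_2088', 'smi_2087']
```

Every hit reported on this page has an E-value below 0.05, which is consistent with a cutoff of this kind, though the page itself does not state whether the comparison is strict.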
 
 
C) Potential GIs and PAIs in NC_013853
S.No  Start     End       Bias  Virulence  Insertion elements  Prediction
1     smi_2089  smi_2062  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
smi_2089         1       26   -7.401874  chromosome partitioning protein
smi_2088         1       25   -7.741181  serine protease
smi_2087         1       28   -6.878174  23S rRNA
smi_2085         1       26   -7.612612  *competence stimulating peptide (CSP) precursor
smi_2084         1       22   -6.596533  histidine kinase
smi_2083         0       21   -5.693325  response regulator
smi_2080        -1       22   -4.451299  **hypothetical protein
smi_2079        -1       21   -4.542491  hypothetical protein
smi_2078        -1       19   -4.425593  hypothetical protein
smi_2077        -1       19   -2.997563  ABC-F family ATPase
smi_2076        -2       23   -3.996426  tryptophanyl-tRNA synthetase
smi_2075        -1       23   -4.253430  inosine monophosphate dehydrogenase
smi_2074        -2       26   -5.581987  recombination protein recF
smi_2073        -3       25   -4.662048  hypothetical protein
smi_2072        -3       26   -3.998724  zinc-dependent protease
smi_2071        -4       26   -3.439908  zinc-dependent protease
smi_2070        -3       25   -2.719896  hypothetical protein
smi_2069        -2       26   -3.213465  phosphatidylglycerophosphate synthase
smi_2068         0       25   -2.846390  ABC transporter ATP-binding protein cobalt
smi_2067         2       22   -2.500861  cobalt ABC transporter ATP-binding protein
smi_2066         5       24   -2.363636  ABC transporter permease, cobalt transport
smi_2065         6       29   -1.754955  cell shape determining protein MreC
smi_2064         6       30   -1.299065  cell shape determining protein MreD
smi_2063         5       25   -0.544395  general stress protein GSP-781 amidase
smi_2062         2       20   -0.577264  30S ribosomal protein S2
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2088  V8PROTEASE    62     1e-12    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 61.6 bits (149), Expect = 1e-12
Identities = 31/165 (18%), Positives = 58/165 (35%), Gaps = 34/165 (20%)

Query: 117 IVTNNHVINGASKVDIRLS------------DGTKVPGEIVGADTFSDIAVVKISSEKVT 164
++TN HV++ L +G +I D+A+VK S +
Sbjct: 114 LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQN 173

Query: 165 -------TVAEFGDSSKLTVGETAIAIGSPLG-SEYANTVTQGIVSSLNRNVSLKSEDGQ 216
A ++++ V + G P ++G ++
Sbjct: 174 KHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKITY------------- 220

Query: 217 AISTKAIQTDTAINPGNSGGPLINIQGQVIGITSSKIATNGGTSV 261
+ +A+Q D + GNSG P+ N + +VIGI + +V
Sbjct: 221 -LKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPNEFNGAV 264


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2087  56KDTSANTIGN  27     0.037    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 27.2 bits (60), Expect = 0.037
Identities = 16/46 (34%), Positives = 22/46 (47%), Gaps = 1/46 (2%)

Query: 14 KYLKDGIAEYSKRISRFAKLEMIELADEKTPDRASESENQ-KILEI 58
K L D I + I FA + I + D P+ AS + Q KI E+
Sbjct: 262 KVLSDKIIQIYSDIKPFADIAGINVPDTGLPNSASIEQIQSKIQEL 307


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2080  HTHTETR       49     1e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.2 bits (117), Expect = 1e-09
Identities = 22/104 (21%), Positives = 44/104 (42%), Gaps = 8/104 (7%)

Query: 6 KRLKTKRTIENAMVQLLMEQPFDQISTVKLAEKAGISRSSFYTHYKDKYDMIEHYQSKLF 65
+ +T++ I + ++L +Q S ++A+ AG++R + Y H+KDK D+
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 66 HTF-EYIFQKHAHHK-------RDAILEVFEYLESEPLLAALLS 101
E + A R+ ++ V E +E L+
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLME 111


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2079  ICENUCLEATIN  34     0.003    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 34.0 bits (77), Expect = 0.003
Identities = 59/181 (32%), Positives = 74/181 (40%), Gaps = 33/181 (18%)

Query: 479 STSLTGLSSGLTEIQGTLTSKLVPASQSITSGVNAY-TAGVDK---VSQGASQLSEKNST 534
ST G S LT G+ ++ S ++T+G + TAG D G+S S S
Sbjct: 982 STQTAGYQSTLTAGYGS--TQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSF 1039

Query: 535 LTGSLDQLVSGSTTLTQKSSKLTAGVGQLVEKTPELVSGIEKLST---GSNQLNQKSQEL 591
LT GST ++ S LTAG G L+SG T GSNQ+ L
Sbjct: 1040 LTAGY-----GSTLISGLRSVLTAGYGS------SLISGRRSSLTAGYGSNQIASHRSSL 1088

Query: 592 IAGVDKLQ-----------SGSSQLADKSSQLISGAS--QLESGANKLADGAGKLAEGGT 638
IAG + Q GSSQ A S LISGA Q+ KL GA G
Sbjct: 1089 IAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGD 1148

Query: 639 K 639
+
Sbjct: 1149 R 1149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2077  PF05272       32     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.009
Identities = 11/30 (36%), Positives = 14/30 (46%)

Query: 32 LIGANGAGKSTFLKILAGDIEPTTGHISLG 61
L G G GKST + L G + H +G
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2063  GPOSANCHOR    51     5e-09    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.8 bits (121), Expect = 5e-09
Identities = 40/230 (17%), Positives = 76/230 (33%), Gaps = 9/230 (3%)

Query: 27 AETTDDKIAAQDNKISNLTAQQQEAQKQVDQVQEQVSAIQTEQSNLQSENDRLQAESKKL 86
D ++ + +KI L A++ + +K ++ +A + L++E L A L
Sbjct: 101 LRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL 160

Query: 87 EGEITELSKNIVSRNDSLQ-----KQARSAQTNGAATNYINTIVNSKSITEAISRVAAMS 141
E + + + ++ K A A+ + S + + I + A
Sbjct: 161 EKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEK 220

Query: 142 EIVSANNKMLEQQKADKKAISEKQVANNDAINTVIA----NQQKLADDAQSLTTKQAELK 197
++A LE+ S A + A Q +L +
Sbjct: 221 AALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADS 280

Query: 198 AAELNLAAEKATAEDEKASLLEKKAAAEAEAKAAAEAEAAYKAKQASQQQ 247
A L AEKA E EKA L + A ++ A + + +
Sbjct: 281 AKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEA 330



Score = 33.9 bits (77), Expect = 0.001
Identities = 24/199 (12%), Positives = 67/199 (33%), Gaps = 5/199 (2%)

Query: 31 DDKIAAQDNKISNLTAQQQEAQKQVDQVQEQVSAIQTEQSNLQSENDRLQAESKKLEGEI 90
+ + A + + ++ ++ + +A+ +++L+ + S +I
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKI 248

Query: 91 TELSKNIVSRNDSLQKQARSAQTNGAATNYINTIVNSKSITEAISRVAAMSEIVSANNKM 150
L + + ++ + + + I + AA+ +
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADS-----AKIKTLEAEKAALEAEKADLEHQ 303

Query: 151 LEQQKADKKAISEKQVANNDAINTVIANQQKLADDAQSLTTKQAELKAAELNLAAEKATA 210
+ A+++++ A+ +A + A QKL + + + L+ K
Sbjct: 304 SQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL 363

Query: 211 EDEKASLLEKKAAAEAEAK 229
E E L E+ +EA +
Sbjct: 364 EAEHQKLEEQNKISEASRQ 382



Score = 31.6 bits (71), Expect = 0.006
Identities = 40/219 (18%), Positives = 93/219 (42%), Gaps = 18/219 (8%)

Query: 28 ETTDDKIAAQDNKISNLTAQQQEAQKQVDQVQEQVSAIQTEQSNLQSENDRLQAESKKLE 87
+T + + AA + + + L + A ++ ++ E++ L++E L+ +S+ L
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLN 308

Query: 88 GEITELSKNIVSRNDSLQKQARSAQTNGAATNYINTIVNSKSITEAISRVAAMSEIVSAN 147
L +++ + ++ ++ Q + I+EA SR + ++ ++
Sbjct: 309 ANRQSLRRDLDASREAKKQLEAEHQK----------LEEQNKISEA-SRQSLRRDLDASR 357

Query: 148 NKMLEQQKADKKAISEKQVANNDAINTVIANQQKLADDAQSLTTKQAELKAAELNLAA-E 206
+ + +K + +++ + ++ L ++ + L+ A LAA E
Sbjct: 358 EAKKQLEAEHQKLEEQNKISEASRQSL----RRDLDASREAKKQVEKALEEANSKLAALE 413

Query: 207 KATAEDE--KASLLEKKAAAEAEAKAAAEAEAAYKAKQA 243
K E E K ++KA +A+ +A A+A AKQA
Sbjct: 414 KLNKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQA 452


2     smi_2023  smi_1999  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
smi_2023        -1       19    3.883916  ABC zinc transporter, metal-binding lipoprotein
smi_2022         0       21    3.750263  5'-nucleotidase
smi_2021         1       19    2.208754  hypothetical protein
smi_2020         2       19    0.894402  beta-galactosidase/beta-glucuronidase,
smi_2019         2       19    0.187820  hypothetical protein
smi_2018         2       20   -0.760186  fucosidase
smi_2017         1       23   -4.738319  sugar hydrolase
smi_2016         1       25   -4.948848  phage-related integrase, recombinase
smi_2015         0       30   -2.335535  transcriptional regulator
smi_2014         0       33   -1.538167  transcriptional regulator
smi_2013         0       35   -0.510626  transcriptional regulator
smi_2012         1       36   -1.574555  hypothetical protein
smi_2011         3       40    1.160970  hypothetical protein
smi_2010         5       34    0.207017  hypothetical protein
smi_2009         4       32   -0.170821  hypothetical protein
smi_2008         4       30    0.460313  hypothetical protein
smi_2007         4       27   -0.116118  hypothetical protein
smi_2006         4       26   -0.456731  hypothetical protein
smi_2005         3       26   -1.234779  replication protein
smi_2004        -1       22    1.336338  phage-related DNA primase
smi_2003        -1       17    2.581933  hypothetical protein
smi_2002        -2       17    3.255289  hypothetical protein
smi_2001        -1       20    4.375784  hypothetical protein
smi_2000        -1       22    4.132716  hypothetical protein
smi_1999        -2       21    3.882675  transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2001  PF07520       26     0.022    Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 25.7 bits (56), Expect = 0.022
Identities = 10/37 (27%), Positives = 19/37 (51%), Gaps = 5/37 (13%)

Query: 9 IEARGEGDSPHISLWFDKQLRDVFISLISLSFCKQLR 45
+EA +G++ + LW L+++F L F + R
Sbjct: 195 LEADEDGNAVDLQLWVSDWLKEMF-----LDFKRAER 226


3     smi_1989  smi_1948  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
smi_1989         3       20    1.528992  hypothetical protein
smi_1988         2       19    1.788097  hypothetical protein
smi_1987        -1       18    2.273785  hypothetical protein
smi_1986        -1       19    2.996632  hypothetical protein
smi_1985        -1       20    3.664141  hypothetical protein
smi_1984        -1       20    3.560401  transcriptional regulator
smi_1983        -1       15    3.030792  cyanate permease
smi_1982        -2       17    2.279071  glycosyl transferase
smi_1981        -1       19    2.022990  nucleoside-diphosphate sugar isomerase
smi_1980         0       20   -1.078808  phosphatase
smi_1979         1       23   -5.331153  L-serine dehydratase, alpha subunit
smi_1978         2       29   -7.181608  L-serine dehydratase, beta subunit
smi_1977         3       32   -8.472029  LysM domain protein
smi_1976         1       27   -7.217743  transcriptional regulator
smi_1975         1       26   -6.725331  hypothetical protein
smi_1974         1       21   -5.013221  hypothetical protein
smi_1973         0       18   -2.799903  hypothetical protein
smi_1972        -1       17   -1.427660  ABC transporter ATP-binding protein
smi_1971        -1       16   -1.112825  hypothetical protein
smi_1970        -1       17   -0.918945  hypothetical protein
smi_1969        -1       15    0.397008  ABC transporter ATP-binding protein
smi_1968        -3       14    1.996409  ABC transporter substrate-binding protein
smi_1967        -2       17    2.357403  argininosuccinate synthase
smi_1966        -1       21    4.013185  argininosuccinate lyase
smi_1965         0       23    4.718173  hypothetical protein
smi_1964         0       25    5.296387  hypothetical protein
smi_1963         0       22    4.874741  transposase, ISSmi1
smi_1962        -1       22    3.680596  tRNA methyl transferase
smi_1961         0       23    3.412692  hypothetical protein
smi_1960        -1       24    1.412294  NAD/FAD-binding enzyme GidA
smi_1959         1       25    0.909829  hydrolase
smi_1958        -2       25    1.745105  hypothetical protein
smi_1957         0       25    2.769628  hypothetical protein
smi_1956         0       25    4.945641  cibC
smi_1955         1       23    3.886840  excreted peptide
smi_1954        -1       22    3.094094  excreted peptide
smi_1953         1       20    3.083492  transposase, ISSmi1
smi_1952         0       20    2.654015  molecular chaperone
smi_1951         0       14    1.797807  acetyltransferase
smi_1950         1       13   -0.206031  metal-dependent protease
smi_1949         4       25    0.213897  transcriptional regulator
smi_1948         2       20    2.126485  metal-dependent CAAX amino terminal membrane
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1988  IGASERPTASE   42     4e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.0 bits (98), Expect = 4e-06
Identities = 29/165 (17%), Positives = 58/165 (35%), Gaps = 14/165 (8%)

Query: 7 ESNDFVKTSSKNKPDEQAQDGADKAEETIPDLDTPIEKNTQLEKEVSQAEAELESQQEEK 66
++ + K + N + ++ + T K T ++ +A+ E E QE
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP 1123

Query: 67 IETPEDS--EAKTETEEKKALDSTEEEPDLSKETEKVTKAEENQEALSQQKTTTKEPLLL 124
T + S + ++ET + +A + E +P +E Q + T +
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDP--------TVNIKEPQSQTNTTADTEQPAKET 1175

Query: 125 SKSLESPYIPDQAQKSTDRWKEQVLDFWSWLVEALKSPTSKLETS 169
S ++E P + + E + A PT E+S
Sbjct: 1176 SSNVEQPVTESTTVNTGNSVVENPEN----TTPATTQPTVNSESS 1216



Score = 36.6 bits (84), Expect = 2e-04
Identities = 26/133 (19%), Positives = 51/133 (38%), Gaps = 2/133 (1%)

Query: 9 NDFVKTSSKNKP-DEQAQDGADKAE-ETIPDLDTPIEKNTQLEKEVSQAEAELESQQEEK 66
N V T++ P + QA + + E I +D E E+ ++E
Sbjct: 989 NQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQES 1048

Query: 67 IETPEDSEAKTETEEKKALDSTEEEPDLSKETEKVTKAEENQEALSQQKTTTKEPLLLSK 126
++ + TET + + E + ++ T+ A+ E Q T TKE + K
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK 1108

Query: 127 SLESPYIPDQAQK 139
++ ++ Q+
Sbjct: 1109 EEKAKVETEKTQE 1121



Score = 33.5 bits (76), Expect = 0.002
Identities = 32/159 (20%), Positives = 57/159 (35%), Gaps = 21/159 (13%)

Query: 20 PDEQAQDGADKAEETIPDLDTPIEKNTQ--LEKEVSQAEAELESQQEEKIETPEDSEAKT 77
P E + E +EKN Q E E E++ K T + A++
Sbjct: 1033 PSETTE----TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQS 1088

Query: 78 ETEEKKALDSTEEEPDLSKETEKVTKAEENQEALSQQKTTTKEPLLLSKSLESPYIPDQA 137
+E K+ + +KET V K EE + +++ T + P + S+ +
Sbjct: 1089 GSETKET------QTTETKETATVEK-EEKAKVETEK--TQEVPKVTSQVSPKQEQSETV 1139

Query: 138 QKSTDRWKEQVLDFWSWLVEALKSPTSKLETSSTHSYTA 176
Q + +E +K P S+ T++ A
Sbjct: 1140 QPQAEPAREND------PTVNIKEPQSQTNTTADTEQPA 1172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1986  TCRTETB       28     0.025    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.9 bits (62), Expect = 0.025
Identities = 19/103 (18%), Positives = 36/103 (34%), Gaps = 4/103 (3%)

Query: 76 NRQILHIALLAL---LAAPIGIPLGIAILVSL-FAILVAALTVILAFFAVSILGIIGGFL 131
N +L+++L + P + L F+I A + + L + G +
Sbjct: 29 NEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIII 88

Query: 132 FLVESFTVLAQAKSAFILIFGSGLLAIGASSLVLLGISYVARF 174
S +LI + GA++ L + VAR+
Sbjct: 89 NCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1984  TCRTETA       44     5e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.4 bits (105), Expect = 5e-07
Identities = 64/360 (17%), Positives = 118/360 (32%), Gaps = 17/360 (4%)

Query: 6 LFFVPGIILIGVSLRTPFTVLPIILGDISQGLGVEVSSLGVLTSLPLLMFTLFSLFSTRL 65
+ + +G+ L P VLP +L D+ + G+L +L LM + L
Sbjct: 10 ILSTVALDAVGIGLIMP--VLPGLLRDLVHS-NDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 66 AQKIGLEHFFTYSLFFLTIGSLIRLI--NLPLLYLGTL---MVGASIAVINVLLPSLIQA 120
+ + G SL + I L +LY+G + + GA+ AV + +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDG 126

Query: 121 NQ-PKKIGFLTTLYVTSMGIATALASYLAVPITQASSWKGLILLLTLLCLATFLVWLP-- 177
++ + GF++ + M L + A + L FL+
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 178 -NHRYNHRLAPQTKQKSQTKVMHNKQVWAVIVFAGFQSLLFYTAMTWLPTMAIHAGLSSH 236
R R A + + VF Q + A W+ +
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDAT 246

Query: 237 EAGLLTSIFSLISIPFSMTIPSLTTSLSTRNRQLMLTLVSLAGMVGISMLFFPVGNFFYW 296
G+ + F ++ I + R LML + +A G +L F W
Sbjct: 247 TIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGM--IADGTGYILLAFATR---GW 301

Query: 297 LAIHLLIGTATSALFPYLMVNFSLKTSAPEKTAQLSGLSQTGGYILAAFGPTLFGYSFDL 356
+A +++ A+ + + + E+ QL G + + GP LF +
Sbjct: 302 MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1983  BCTERIALGSPF  38     5e-05    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 37.5 bits (87), Expect = 5e-05
Identities = 23/85 (27%), Positives = 40/85 (47%), Gaps = 10/85 (11%)

Query: 186 DLLWLNMIATGAKTGNLDQILCQVRVGAGMFERRGGLRYLKLYRQARQRMLKRGQISYME 245
+ L+ M+A G +G+LD +L ++ A E+R +R + +Q M+
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRL---ADYTEQRQQMR-----SRIQQAMIY--PCVLTV 181

Query: 246 YAKSVAIQMVVALCPGFVRQFIFMK 270
A +V ++ + P V QFI MK
Sbjct: 182 VAIAVVSILLSVVVPKVVEQFIHMK 206


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1982  NUCEPIMERASE  81     9e-19    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 81.0 bits (200), Expect = 9e-19
Identities = 50/284 (17%), Positives = 91/284 (32%), Gaps = 57/284 (20%)

Query: 294 TILVTGAGGSIGSEICRQ----------VSRFNPERIVLLGHGENSIYLVYHELIRKFQG 343
LVTGA G IG + ++ + N V L EL+ +
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQAR-------LELLAQ--- 51

Query: 344 IDYVPVIADIQDYDRLLQVFEQYKPAIVYHAAAHKHVPMMERNPKEAFKNNIRGTYNVAR 403
+ D+ D + + +F V+ + V NP +N+ G N+
Sbjct: 52 PGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILE 111

Query: 404 AVDEAKVPKMVMIST---------------DKAVNPPNVMGATKRVAELIVTGFNQRSQS 448
K+ ++ S+ D +P ++ ATK+ EL+ ++
Sbjct: 112 GCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171

Query: 449 TYCAVRFGNVLGSRGS---VIPVFERQIAEGGPVTV-TDFRMTRYFMTI----------- 493
+RF V G G + F + + EG + V +M R F I
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 494 -------PEASRLVIHAGAYAKDGEVFILDMGKPVKIYDLAKKM 530
+ + A V+ + PV++ D + +
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQAL 275


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1978  PF03544       30     0.004    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 30.3 bits (68), Expect = 0.004
Identities = 19/64 (29%), Positives = 24/64 (37%), Gaps = 3/64 (4%)

Query: 66 VHMIYVGQELVIDGPAAPVAPASTTYEAPAAQDE--AVSATVAETIEVEEETPAASGTVA 123
++Y VI+ PA P P S T APA + AV +E E E
Sbjct: 30 AGLLYTSVHQVIELPA-PAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPK 88

Query: 124 EETV 127
E V
Sbjct: 89 EAPV 92


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1959  BACINVASINB   25     0.036    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 25.1 bits (54), Expect = 0.036
Identities = 12/43 (27%), Positives = 23/43 (53%)

Query: 31 SELEGRIAARQLVEENRPEYNIEYIELLSNKLLDYEKETGAFE 73
S+LE R+A Q + E++ E I+ + L + ++ T +E
Sbjct: 102 SQLESRLAVWQAMIESQKEMGIQVSKEFQTALGEAQEATDLYE 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1952  SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.1 bits (70), Expect = 0.001
Identities = 22/75 (29%), Positives = 34/75 (45%), Gaps = 7/75 (9%)

Query: 48 LAYDGTEVIGFLAVQENIFE-AEVLQIAVKGAYQGKGIASAL------FAQLSTDKEIFL 100
L Y IG + ++ N A + IAV Y+ KG+ +AL +A+ + + L
Sbjct: 69 LYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLML 128

Query: 101 EVRKSNQRAQAFYKK 115
E + N A FY K
Sbjct: 129 ETQDINISACHFYAK 143


4     smi_1935  smi_1929  Y     N          N                   Genomic Island
LocusTag    DNBias  CDNBias     %GCBias  Product
smi_1935         0       27    3.687606  hypothetical protein
smi_1934         0       27    4.394234  NrdI family protein
smi_1933         0       26    3.796113  hypothetical protein
smi_1932         0       22    3.949044  transcriptional regulator
smi_1931        -1       23    3.675898  hypothetical protein
smi_1930        -1       22    3.859950  DNA mismatch repair protein hexB
smi_1929         0       21    3.917429  Holliday junction DNA helicase RuvA
5     smi_1870  smi_1847  Y     N          Y                   Genomic Island
LocusTag    DNBias  CDNBias     %GCBias  Product
smi_1870         2       20   -0.446760  leucyl-tRNA synthetase
smi_1869         1       21   -0.559854  ABC transporter ATP-binding protein
smi_1868         0       22    0.026253  ABC transporter permease
smi_1867         1       21    1.971492  transcriptional regulator
smi_1866         1       22    0.864457  hypothetical protein
smi_1863         2       27    0.885174  acetyltransferase
smi_1862         1       27    2.389119  acetyltransferase
smi_1861         0       24    2.720502  transposase, ISSmi3
smi_1860        -1       21    2.368507  hypothetical protein
smi_1859         0       19    1.665815  hypothetical protein
smi_1857        -2       15    2.617859  Holliday junction DNA helicase RuvB
smi_1856        -2       14    3.027769  hypothetical protein
smi_1855        -1       17    3.039246  transposase, ISSmi3
smi_1854        -1       15    3.242181  transposase, ISSmi1
smi_1853        -2       15    3.189829  UDP diphosphate synthase
smi_1852         0       22    3.467002  phosphatidate cytidylyltransferase
smi_1851         1       26    3.904113  metallo protease
smi_1850         2       28    3.920771  prolyl-tRNA synthetase
smi_1849         1       26    3.300477  glycosyl hydrolase
smi_1848         2       28    2.928953  glutamine--fructose-6-phosphate transaminase
smi_1847         4       30    2.979686  oxidoreductase
6     smi_1806  smi_1775  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
smi_1806         0       18    3.059690  rRNA methylase
smi_1805         0       18    3.032441  FMN riboswitch (RFN element)
smi_1804         0       16    3.071918  hypothetical protein
smi_1803         0       16    3.385421  phosphatase
smi_1802         0       15    2.778375  hypothetical protein
smi_1801         0       18    3.211196  hypothetical protein
smi_1800        -1       14    0.111257  T-box leader
smi_1799        -1       16   -0.718524  phenylalanyl-tRNA synthetase, alpha chain
smi_1798         1       22   -2.951108  hypothetical protein
smi_1797         1       30   -6.808120  phenylalanyl-tRNA synthetase, beta chain
smi_1796         0       30   -7.997393  Na+ ABC transporter ATP-binding protein
smi_1795         1       33   -9.348456  ABC transporter permease, Na+ export
smi_1794         2       37  -10.159646  transposase, ISSmi1
smi_1793         2       37  -10.148300  50S ribosomal protein L13
smi_1792         2       35  -10.169653  30S ribosomal protein S9
smi_1791         1       33   -9.813166  integrase
smi_1790         1       34   -9.554792  hypothetical protein
smi_1789         1       32   -9.582399  hypothetical protein
smi_1788         0       28   -8.246200  serine/threonine metallophosphatase
smi_1787         1       27   -8.191826  Ser/Thr protein phosphatase
smi_1786         0       30   -7.310126  hypothetical protein
smi_1785.1       2       30   -7.298191  FtsK/SpoIIIE family protein
smi_1785         1       28   -7.220958  hypothetical protein
smi_1784         3       22   -5.100402  hypothetical protein
smi_1783         2       19   -2.782786  hypothetical protein
smi_1782         2       19   -0.572307  hypothetical protein
smi_1781         3       19   -0.381251  hypothetical protein
smi_1780         2       20   -0.373276  drug transporter, major facilitator superfamily
smi_1779         2       22    1.515985  transcriptional regulator
smi_1778         3       26    2.141485  integrase/recombinase, phage integrase family,
smi_1777         3       24    0.738007  integrase/recombinase, phage integrase family,
smi_1776         3       20   -1.543398  ATPase
smi_1775         3       23   -1.425953  oligopeptide permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1803  GPOSANCHOR    29     0.033    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 29.3 bits (65), Expect = 0.033
Identities = 19/89 (21%), Positives = 35/89 (39%), Gaps = 8/89 (8%)

Query: 32 EEQLKALREETLASLKQIT-AENEKEMQDLRVSVLGKKGSLTEI--LKGMKDVSAEMRPI 88
E + L + Q+ A + +DL S KK E L+ +S R
Sbjct: 294 EAEKADLEHQ-----SQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQS 348

Query: 89 IGKHVNEARDVLTAAFEETAKLLEEKKVA 117
+ + ++ +R+ E KL E+ K++
Sbjct: 349 LRRDLDASREAKKQLEAEHQKLEEQNKIS 377


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1794  PF08280       27     0.013    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 26.7 bits (59), Expect = 0.013
Identities = 8/49 (16%), Positives = 21/49 (42%), Gaps = 2/49 (4%)

Query: 15 SMITNLLQNEISHFKE--DLGLDSPYLNKGQTCKYLGISNNTLDGWIQK 61
S+I L++ I + L + L + + G++ L+ + ++
Sbjct: 33 SLIEKYLESSIESKCQLVVLFFKTSSLPITEVAEKTGLTFLQLNHYCEE 81


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1784  TCRTETB       113    2e-29    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 113 bits (283), Expect = 2e-29
Identities = 95/409 (23%), Positives = 177/409 (43%), Gaps = 22/409 (5%)

Query: 12 LLFILSLGYIMAVLDTTGVVLAVPHIETAMSVSLEQSIWIINAYTLALGSLLLLSGNLTT 71
+L L + +VL+ + +++P I + + W+ A+ L + G L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 72 KYGAKRILLIGMTIFTLASLGCSFSPNIETLIIL-RFIQGFGTSLFMPSSLALLFISYPD 130
+ G KR+LL G+ I S+ + +L+I+ RFIQG G + F P+ + ++ Y
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAF-PALVMVVVARYIP 133

Query: 131 STKRARMLGIWTAIISVATGTGSFIGGLIINYFGWRGIFLVNIPFGILTVISILLLVK-- 188
R + G+ +I+++ G G IGG+I +Y W +L+ IP ++T+I++ L+K
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIP--MITIITVPFLMKLL 189

Query: 189 NNIANLKTRIDIFSNIFLVTTIGSLVIYLVEGNQYGYSNSNLLLFLLLFILFAIGLVVHD 248
+K DI I + I ++ ++ S + FL++ +L + V H
Sbjct: 190 KKEVRIKGHFDIKGIILMSVGIVFFML---------FTTSYSISFLIVSVLSFLIFVKHI 240

Query: 249 KKSKTPIVPHQLLKNAKFIISNLLGLVVNISLYGIVLVLGLYFQTYLNLSSMVSG-LLIL 307
+K P V L KN F+I L G ++ ++ G V ++ + LS+ G ++I
Sbjct: 241 RKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIF 300

Query: 308 PGMIVLIIGNLFYARAVKRFSVGSLATVSIIFAIVGAAGIFGIGVLFHEIQLYILIPLFS 367
PG + +II V R + + + F V L ++ I +
Sbjct: 301 PGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSV---SFLTASFLLETTSWFMTIIIVF 357

Query: 368 LMSLGIGVLTPATTTILMEAAGQELSGIAGATLNANKQIGGLFGTTIMG 416
++ T +T + QE +G + LN + G I+G
Sbjct: 358 VLGGLSFTKTVISTIVSSSLKQQE-AGAGMSLLNFTSFLSEGTGIAIVG 405


7     smi_1749  smi_1693  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
smi_1749        -1       19    3.225657  hypothetical protein
smi_1748        -1       20    2.589239  DNA methylase
smi_1747         0       23    2.846766  hypothetical protein
smi_1746         0       24    2.795198  6-phosphogluconate dehydrogenase,
smi_1745         1       23    2.252566  response regulator
smi_1744         1       16    1.843439  choline binding protein CbpF
smi_1743        -1       20    2.331833  mevalonate kinase
smi_1742        -1       21    2.824403  mevalonate pyrophosphate decarboxylase
smi_1741        -1       23    3.081643  phosphomevalonate kinase
smi_1740         0       23    2.515333  isopentenyl-diphosphate:dimethylallyl
smi_1739        -1       23    2.693162  hypothetical protein
smi_1738        -1       23    3.834116  histidine kinase
smi_1737         0       22    2.458974  response regulator
smi_1736         0       19    1.167561  DNA alkylation repair enzyme, truncation
smi_1735        -2       15   -0.520821  FKBP-type peptidyl-prolyl cis-trans isomerase
smi_1734        -1       17   -2.072519  ATP-dependent exoDNAse (exonuclease V), alpha
smi_1733         0       18   -2.982784  signal peptidase I
smi_1732         1       28   -6.644192  ribonuclease HIII
smi_1731        -1       34   -7.695244  hypothetical protein
smi_1730         0       29   -6.311857  hypothetical protein
smi_1729         2       25   -6.621971  mismatch repair ATPase (MutS family)
smi_1728         3       21   -4.500714  transcriptional regulator
smi_1727         3       16   -2.396231  hypothetical protein
smi_1726         3       14   -1.432721  HSP70 family protein
smi_1725         2       12    0.205923  dnaJ domain protein
smi_1724         2       13   -0.766823  grpE domain protein
smi_1723         1       15   -0.718017  hypothetical protein
smi_1722         4       20   -3.582772  hypothetical protein
smi_1721         4       25   -6.104479  choline binding domain Cbp6
smi_1720         5       28   -6.708890  choline binding protein cbp12
smi_1719         4       27   -6.385581  hypothetical protein
smi_1718         5       26   -5.618107  Na+/alanine symporter
smi_1717         5       25   -5.031705  hypothetical protein
smi_1716         5       27   -2.488440  exfoliative toxin A
smi_1714         6       27   -2.347161  TetR family transcriptional regulator
smi_1713         6       28   -1.634419  ABC transporter ABC nucleotide-binding domain
smi_1712         9       32   -2.204298  hypothetical protein
smi_1711         8       33   -1.394015  site specific recombinase, authentic frame
smi_1710         9       34   -2.582102  hypothetical protein
smi_1709         8       35   -7.375020  site-specific recombinase
smi_1708         7       37   -7.844115  nucleotidyltransferase
smi_1707         9       40   -9.824502  methyltransferase
smi_1706         7       42  -11.190171  streptomycin aminoglycoside 6-adenyltransferase
smi_1705         6       45  -11.834782  streptothricin acetyltransferase
smi_1704         4       44  -12.161064  aminoglycoside 3'-phosphotransferase
smi_1703         4       30   -7.544278  hypothetical protein
smi_1702         4       34   -8.904986  hypothetical protein
smi_1701         2       31   -7.738366  acetyl transferase
smi_1700         4       27   -5.358287  aminoglycoside acetyltransferase and
smi_1699         4       26   -3.936117  hypothetical protein
smi_1698         3       25   -3.216219  hypothetical protein
smi_1697         4       25   -1.622462  hypothetical protein
smi_1696         3       26    0.377914  hypothetical protein
smi_1695         2       25   -0.137701  hypothetical protein
smi_1694         3       24   -1.431943  multidrug ABC transporter ATPase/permease
smi_1693        -1       25   -3.288356  AraC family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1749  HTHFIS        51     1e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 50.6 bits (121), Expect = 1e-09
Identities = 21/111 (18%), Positives = 45/111 (40%)

Query: 4 RILLLEKERNLAHFLSLELQKEQYRVDQVEEGQKALSMALQTDYDLILLNARLGDMTAQD 63
IL+ + + + L+ L + Y V D DL++ + + D A D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FADKLSRTKPASVIMVLDHREELQGQIETIQRFAVSYIYKPVIIENLVARI 114
++ + +P ++V+ + I+ ++ A Y+ KP + L+ I
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1742  PF06580       36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 2e-04
Identities = 18/91 (19%), Positives = 43/91 (47%), Gaps = 9/91 (9%)

Query: 244 ILQELISNTLRHA-----QASCLDVYLYQTDVELQLKVVDNGIGFQLGSFDELSYGLRNI 298
++Q L+ N ++H Q + + + + + L+V + G + + GL+N+
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNV 318

Query: 299 KERVEDMAG---TVQLLTAPKQGLAVDIRIP 326
+ER++ + G ++L + A+ + IP
Sbjct: 319 RERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1741  HTHFIS        66     4e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.4 bits (162), Expect = 4e-15
Identities = 31/161 (19%), Positives = 61/161 (37%), Gaps = 14/161 (8%)

Query: 2 KILLVDDHEMVRLGLKSYFDLQD-DVEVVGEAANGSQGIDLALELRPDVIVMDIVMPEMN 60
IL+ DD +R L DV + N + D++V D+VMP+ N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 61 GIDATLAILKEWPEAKILIVTSYLDNEKIMPVLNAGAKGYMLKTSSADELLHAVRKVAAG 120
D I K P+ +L++++ + GA Y+ K EL+ + +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 121 KLAIEQEVSKKVEYHRNHIELHEDLTARE---RDVLQLIAK 158
E ++ + + L R +++ +++A+
Sbjct: 118 ---ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1730  SHAPEPROTEIN  129    9e-36    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 129 bits (327), Expect = 9e-36
Identities = 82/364 (22%), Positives = 142/364 (39%), Gaps = 51/364 (14%)

Query: 2 SKAIGIDLGTTYSAVSVLSETGQPQILLNQDGENLTPSVVFFQDFDGKDEP----LVGIQ 57
S + IDLGT + + V + I+LN+ PSVV D P VG
Sbjct: 10 SNDLSIDLGTANTLIYVKGQG----IVLNE------PSVVAI-RQDRAGSPKSVAAVGHD 58

Query: 58 AKNLAASSPEAVVQYVKRQMGNPNWKFDSPSDTVYTAEEISAIILKRLKEGAENALGDKV 117
AK + +P + R M D + E++ +K++ N+
Sbjct: 59 AKQMLGRTPGNIA--AIRPMK------DGVIADFFVTEKMLQHFIKQVHS---NSFMRPS 107

Query: 118 EDVVITVPAYFDDARRTATKHAGEIAGLNVLRVLNEPTAAALAYGISAEKNETVLVYDLG 177
V++ VP R A + + + AG + ++ EP AAA+ G+ + +V D+G
Sbjct: 108 PRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIG 167

Query: 178 GGTFDVTLMKIKDGEFDVIATDGDRNLGGFDFDNALSMIIAEKME----EQGAEDIYTDE 233
GGT +V ++ + + +GG FD A+ + E AE I E
Sbjct: 168 GGTTEVAVI-----SLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKH-E 221

Query: 234 HFTALLREKSENTK-RGLTTVEKTNVFLDYKGKSYKIPITRVEFEEATKSLMNRTEELLD 292
+A ++ + RG E G + E EA + + +
Sbjct: 222 IGSAYPGDEVREIEVRGRNLAE---------GVPRGFTLNSNEILEALQEPLTGIVSAVM 272

Query: 293 DVVEESG--MSWDEIDQ-VLLIGGSTRMPMVQRKLEEKIGKKIVYSINPDEAVAQGAAIQ 349
+E+ ++ D ++ ++L GG + + R L E+ G +V + +P VA+G
Sbjct: 273 VALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAEDPLTCVARGGGK- 331

Query: 350 AALE 353
ALE
Sbjct: 332 -ALE 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1727  60KDINNERMP   30     0.011    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 29.9 bits (67), Expect = 0.011
Identities = 13/57 (22%), Positives = 23/57 (40%), Gaps = 9/57 (15%)

Query: 11 RRNILYFILGFL-------WGRRQNAKISPEPPTLSTPK--HSELPSISKATHNGKM 58
+RN+L L F+ W + +N + + T +T S A+ GK+
Sbjct: 4 QRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL 60


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1722  ACRIFLAVINRP  31     0.014    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.6 bits (69), Expect = 0.014
Identities = 37/262 (14%), Positives = 85/262 (32%), Gaps = 56/262 (21%)

Query: 161 TFTQLNAITESIQNTTTISPAITALVLSVLVAIAVFGGLKSISKISTAVVPFMAI-IYIL 219
+ + SI + A++L LV +++ ++P +A+ + +L
Sbjct: 326 PYDTTPFVQLSIH--EVVKTLFEAIMLVFLVMYLFLQNMRA------TLIPTIAVPVVLL 377

Query: 220 GTLTVIFFNVGKLPATIALILTSAFSPVAAVGGFAGASIRMAIQNGVARGVFSNESGLGS 279
GT ++ ++ + F V A+G +I + ++N V R + +
Sbjct: 378 GTFAILAAF------GYSINTLTMFGMVLAIGLLVDDAIVV-VEN-VERVMMED----KL 425

Query: 280 APIAAAAAKTNEPVEQGLISMTGTFIDTLI-ICTLTGLTILVTGVWSGDLNGVALTQSAF 338
P A ++ ++ L+ + I + G T G Q +
Sbjct: 426 PPKEATEKSMSQ-IQGALVGIAMVLSAVFIPMAFFGGST------------GAIYRQFSI 472

Query: 339 STVFSHFGPALLTIFLVLFAFTTIL------------GWNYYGERCFEFLFG-------- 378
+ V + L+ + L T+L G+ + F+
Sbjct: 473 TIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGK 532

Query: 379 -VRFIWLYRVVFVLMVLLGGFI 399
+ Y +++ L+V +
Sbjct: 533 ILGSTGRYLLIYALIVAGMVVL 554


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1719  HTHTETR  70     3e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 70.4 bits (172), Expect = 3e-17
Identities = 29/161 (18%), Positives = 63/161 (39%), Gaps = 5/161 (3%)

Query: 10 RRAEIMDAAMILFMEKGYTNTTTQDIVDKVNISRGLLYYHFKNKEDILYCLVEQYSDRLL 69
R I+D A+ LF ++G ++T+ +I ++RG +Y+HFK+K D+ + E +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 70 KDIYIIAYDEDKTAIEKIRS----FIDVTIISSENISAEGTVLQKTIDLKENQYMIDKLS 125
+ + +R ++ T+ + K + E ++ +
Sbjct: 72 ELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA-VVQQAQ 130

Query: 126 HKLVEKLTIYFEKILNQGIMERVFSVKYPLETAELLMTAYV 166
L + E+ L I ++ A ++M Y+
Sbjct: 131 RNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1714  PF03627  28     0.013    PapG
		>PF03627#PapG

Length = 336

Score = 28.0 bits (62), Expect = 0.013
Identities = 16/73 (21%), Positives = 28/73 (38%), Gaps = 11/73 (15%)

Query: 74 DEETFEKAR-----EEKRKRAEKLG----RIREPKDEPKTDYPVKFKAKPLVQKYEDPYK 124
DE F+ E + EK ++ P D P DY V +Q++ Y
Sbjct: 128 DERAFDAGNLCQKPGETTRLTEKFDDIIFKVALPADLPLGDYSVTIPYTSGMQRHFASYL 187

Query: 125 QAEY--AYSLIES 135
A + Y++ ++
Sbjct: 188 GARFKIPYNVAKT 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1709  SACTRNSFRASE  289    e-103    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 289 bits (740), Expect = e-103
Identities = 87/180 (48%), Positives = 124/180 (68%), Gaps = 7/180 (3%)

Query: 1 MITEMKAGHLKDIDKPSEPFEVIGKIIPRYENENWTFTELLYEAPYLKSYQDEEDEEDEE 60
MI +M ++KD +KP+EPF V G++IP +EN WT+TE + PY K Y+D++ +
Sbjct: 1 MIMKMTHLNMKDFNKPNEPFVVFGRMIPAFENGVWTYTEERFSKPYFKQYEDDDMD---- 56

Query: 61 ADCLEYIDNTDKIIYLYYQDDKCVGKVKLRKNWNRYAYIEDIAVCKDFRGQGIGSALINI 120
+ Y++ K +LYY ++ C+G++K+R NWN YA IEDIAV KD+R +G+G+AL++
Sbjct: 57 ---VSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHK 113

Query: 121 SIEWAKHKNLHGLMLETQDNNLIACKFYHNCGFKIGSVDTMLYANFENNFEKAVFWYLRF 180
+IEWAK + GLMLETQD N+ AC FY F IG+VDTMLY+NF E A+FWY +F
Sbjct: 114 AIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLYSNFPTANEIAIFWYYKF 173
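
The hit above reports its E-value as `e-103`, a shorthand with no leading mantissa that most numeric parsers reject. The following is a minimal Python sketch, assuming the `Score = ... bits (...), Expect = ...` line format shown in this report, that reads such a line and treats a bare `e-NNN` as `1e-NNN`. The names `SCORE_RE` and `parse_score` are hypothetical helpers, not part of PredictBias or BLAST.

```python
import re

# Matches e.g. "Score = 289 bits (740), Expect = e-103"; format assumed from this report.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")


def parse_score(line):
    """Return (bit_score, raw_score, e_value) parsed from one Score/Expect line."""
    match = SCORE_RE.search(line)
    if match is None:
        raise ValueError(f"not a Score/Expect line: {line!r}")
    bits, raw, expect = match.groups()
    # A bare exponent such as "e-103" is read as 1e-103 (assumption about the shorthand).
    if expect.startswith("e"):
        expect = "1" + expect
    return float(bits), int(raw), float(expect)


print(parse_score("Score = 289 bits (740), Expect = e-103"))
print(parse_score("Score = 29.9 bits (67), Expect = 0.011"))
```

Parsed E-values can then be compared against the RPSBlast E-value cutoff listed in the input parameters.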


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1705  SACTRNSFRASE  44     4e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.2 bits (104), Expect = 4e-08
Identities = 25/98 (25%), Positives = 39/98 (39%), Gaps = 4/98 (4%)

Query: 59 DIDNLKGFLNDTSSFGFIAKENNKIIGFAYCYTLLRPDGKTMFYLHSIGMLPNYQDKGYG 118
D D ++ + F+ N IG +R + + I + +Y+ KG G
Sbjct: 52 DDDMDVSYVEEEGKAAFLYYLENNCIG----RIKIRSNWNGYALIEDIAVAKDYRKKGVG 107

Query: 119 SKLLSFIKEYSKEIGCSEMFLITDKGNPRACHVYEKLG 156
+ LL E++KE + L T N ACH Y K
Sbjct: 108 TALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHH 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1704  SACTRNSFRASE  34     8e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 8e-04
Identities = 15/51 (29%), Positives = 23/51 (45%), Gaps = 1/51 (1%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155
+Y KG+GT + E+ KE + ++L+ N A Y K F I
Sbjct: 99 KDYRKKGVGTALLHKAIEW-AKENHFCGLMLETQDINISACHFYAKHHFII 148


8  smi_1682  smi_1653  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_1682   2      16        2.773313  aspartate kinase
smi_1681   1      16        2.829929  enoyl-CoA hydratase/carnithine racemase
smi_1680   1      16        2.709516  transcription regulator, MarR family
smi_1679   0      16        3.201981  3-oxoacyl-ACP synthase III
smi_1678  -1      17        3.443406  acyl carrier protein
smi_1677  -2      16        1.849852  enoyl-ACP reductase II
smi_1676  -2      16        1.417677  ACP S-malonyltransferase
smi_1675  -2      16        0.626585  3-oxoacyl-ACP reductase
smi_1674   0      23       -0.369699  3-oxoacyl-ACP synthase
smi_1673   1      25       -2.303923  acetyl-CoA carboxylase
smi_1672  -1      26        0.294190  3-hydroxymyristoyl/3-hydroxydecanoyl-ACP
smi_1671   0      32        2.832906  acetyl-CoA carboxylase biotin carboxylase
smi_1670   0      33        3.095890  acetyl-CoA carboxylase subunit beta
smi_1669   0      30        2.700789  acetyl-CoA carboxylase carboxyl transferase
smi_1668  -2      26        3.669119  excreted peptide
smi_1667   3      16        4.324376  metal-dependent CAAX amino terminal membrane
smi_1666   2      11        2.137293  transcription termination factor nusB
smi_1665   2      14        1.143569  hypothetical protein
smi_1664   1      15        0.549571  translation elongation factor P
smi_1663   0      16       -0.461728  hypothetical protein
smi_1662   1      19       -1.822378  glutamyl tRNA-Gln amidotransferase chain B
smi_1661   1      23       -6.119770  glutamyl-tRNA(Gln) amidotransferase, A subunit
smi_1660   1      27       -6.269558  glutamyl tRNA-Gln amidotransferase C subunit
smi_1659   1      27       -6.626119  hypothetical protein
smi_1658   2      26       -6.519852  peptide chain release factor 3
smi_1657   2      27       -6.608404  cell wall surface anchor family protein, Ser
smi_1656   2      27       -5.669813  glycosyl transferase
smi_1655   3      27       -5.351194  hypothetical protein
smi_1654   2      21       -3.200447  hypothetical protein
smi_1653   2      20       -2.323959  glycosyl transferase family 8
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1680  DHBDHDRGNASE  124    9e-37    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 124 bits (312), Expect = 9e-37
Identities = 79/254 (31%), Positives = 131/254 (51%), Gaps = 13/254 (5%)

Query: 3 LENKNIFITGSSRGIGLAIAHKFAQAGANIV-LNSRGAISEELLAEFSNYGVKVVPISGD 61
+E K FITG+++GIG A+A A GA+I ++ E++++ D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 VSDFADAKRMVEQAIAELGSIDVLVNNAGITQDTLMLKMTEADFEKVLKVNLTGAFNMTQ 121
V D A + + E+G ID+LVN AG+ + L+ +++ ++E VN TG FN ++
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 122 SVLKPMIKAREGAIINMSSVVGLMGNIGQANYAASKAGLIGFTKSVAREVANRNIRVNAI 181
SV K M+ R G+I+ + S + A YA+SKA + FTK + E+A NIR N +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 182 APGMIESDMTAVL------SDKVKDAMLAQ----IPMKEFGQAEQVADLTVFLAGQD--Y 229
+PG E+DM L +++V L IP+K+ + +AD +FL +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 230 LTGQVVAIDGGLSM 243
+T + +DGG ++
Sbjct: 246 ITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1664  CARBMTKINASE  28     0.003    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 28.3 bits (63), Expect = 0.003
Identities = 9/28 (32%), Positives = 17/28 (60%), Gaps = 1/28 (3%)

Query: 7 VTNLEVTKDD-IYKNPSNPILRMYDDDE 33
+T V K+D ++NP+ P+ YD++
Sbjct: 114 ITQTIVDKNDPAFQNPTKPVGPFYDEET 141


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_1663  TCRTETOQM  232    4e-71    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 232 bits (593), Expect = 4e-71
Identities = 108/451 (23%), Positives = 206/451 (45%), Gaps = 41/451 (9%)

Query: 9 KRRTFAIISHPDAGKTTITEQLLYFGGEIREAGTVKGKKTGTFAKSDWMDIEKQRGISVT 68
K +++H DAGKTT+TE LLY G I E G+V GT ++D +E+QRGI++
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVD---KGT-TRTDNTLLERQRGITIQ 57

Query: 69 SSVMQFDYDDKRVNILDTPGHEDFSEDTYRTLMAVDAAVMVVDSAKGIEAQTKKLFEVVK 128
+ + F +++ +VNI+DTPGH DF + YR+L +D A++++ + G++AQT+ LF ++
Sbjct: 58 TGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 129 HRGIPVFTFMNKLDRDGREPLDLLQELEEVLGIASYPMNWPIGMGKAFEGLYDLYNQRLE 188
GIP F+NK+D++G + + Q+++E L + + N +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIK----------QKVELYPNMCVT 167

Query: 189 LYKGDERFASLEDGDKLFGSNPFYEQVKDDIELLNEAGNEFSEEAILAGELTPVFFGSAL 248
+ E++ ++ +G+ + E+ L + L PV+ GSA
Sbjct: 168 NFTESEQWDTVIEGN-----DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAK 222

Query: 249 TNFGVQTFLETFLKFAPEPHGHKKTDGEIVDPYDKDFSGFVFKIQANMDPRHRDRIAFVR 308
N G+ +E + G VFKI+ + R R+A++R
Sbjct: 223 NNIGIDNLIEVITNKFYSS----------THRGQSELCGKVFKIE--YSEK-RQRLAYIR 269

Query: 309 IVSGEFERGMSVNLPRTGKGAKLSNVTQFMAESRENVTNAVAGDIIGVYDTG---TYQVG 365
+ SG SV + K K++ + + + A +G+I+ + + +G
Sbjct: 270 LYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSVLG 328

Query: 366 DTLTVGKNKFEFEPLPTFTPEIFMKVSAKNVMKQKSFHKGIEQLVQEG-AIQLYKNYQTG 424
DT + + + PLP + V +++ + ++ ++ Y + T
Sbjct: 329 DTKLLPQRERIENPLPL----LQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH 384

Query: 425 EYMLGAVGQLQFEVFKHRMEGEYNAEVVMSP 455
E +L +G++Q EV ++ +Y+ E+ +
Sbjct: 385 EIILSFLGKVQMEVTCALLQEKYHVEIEIKE 415


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1662  ICENUCLEATIN  92     2e-20    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 91.7 bits (227), Expect = 2e-20
Identities = 133/570 (23%), Positives = 247/570 (43%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S++S + + A ++++ S++ A ++ +A ++ A ++Q+A +S+
Sbjct: 166 STLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQM 225

Query: 1024 ASTSASQSAS-----TSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1078
A ++Q+ T+ S T+ S+ + S T+ S+ T+ S T+
Sbjct: 226 AGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKG 285

Query: 1079 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
S T+ S T+ + S+ + S T+ +S T+ S T+ S T+ S T
Sbjct: 286 SDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGT 345

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
S+ + S T+ S+ T+ S T+ S T+ S T+ + S+ +
Sbjct: 346 AGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYG 405

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ +S T+ S T+ S T+ S T+ S+ + S T+ S+ T
Sbjct: 406 STQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLT 465

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S T+ S ST+ +S+ + S T+ S T+ S T+ +E
Sbjct: 466 AGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNE 525

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S + S ST+ + S+ + S T++ S T+ S T+ S T+ S T
Sbjct: 526 SDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGT 585

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ S+S+ + S T++ S+ T+ S T+ +S T+ S ST+ ++S+ +
Sbjct: 586 AGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYG 645

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T+ S T+ S T+ S T+ S ST+ ++S+ + S T+ S T
Sbjct: 646 STQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILT 705

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S T+ S TS S ST+ + S+
Sbjct: 706 AGYGSTQTAQEGSDLTSGYGSTSTAGADSS 735



Score = 91.7 bits (227), Expect = 2e-20
Identities = 133/570 (23%), Positives = 258/570 (45%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +A +S+ A ++++ + +A ++ +A +S A ++Q+A +S +
Sbjct: 214 STQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLT 273

Query: 1024 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSAS-- 1081
A ++Q+A + +A ++ +A +S A ++Q+A ++++A ++Q+A
Sbjct: 274 AGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKG 333

Query: 1082 ---TSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
T+ S T+ S+ + S T+ S+ T+ S T+ S T+ S T
Sbjct: 334 SDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGT 393

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
+ S+ + S T+ +S T+ S T+ S T+ S T+ S+ +
Sbjct: 394 AGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYG 453

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S+ T+ S T+ S T+ S ST+ +S+ + S T+ S T
Sbjct: 454 STQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLT 513

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ ++S + S ST+ + S+ + S T++ S T+ S T+
Sbjct: 514 AGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREG 573

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S T+ S+S+ + S T++ S+ T+ S T+ +S T+ S ST
Sbjct: 574 SDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTST 633

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ ++S+ + S T+ S T+ S T+ S T+ S ST+ ++S+ +
Sbjct: 634 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYG 693

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T+ S T+ S T+ S TS S ST+ ++S+ + S T++ S+ T
Sbjct: 694 STQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLT 753

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S T+ +S T+ S ST+ + S+
Sbjct: 754 AGYGSTQTAREQSVLTTGYGSTSTAGADSS 783



Score = 91.4 bits (226), Expect = 2e-20
Identities = 132/570 (23%), Positives = 264/570 (46%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +A +S +A ++++ + +A ++ +A +S A ++Q+A ++++
Sbjct: 262 STQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQT 321

Query: 1024 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTS 1083
A ++Q+A + +A ++ +A +S A ++Q+A +S +A ++Q+A
Sbjct: 322 AGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKG 381

Query: 1084 ASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSAS-----TSASKSAST 1138
+ +A ++ +A +S A ++Q+A ++++A ++Q+A T+ S T
Sbjct: 382 SDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGT 441

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
S+ + S T+ S+ T+ S T+ S T+ S ST+ +S+ +
Sbjct: 442 AGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYG 501

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S T+ S T+ ++S + S ST+ + S+ + S T++ S T
Sbjct: 502 STQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLT 561

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S T+ S T+ S S+ + S T++ S+ T+ S T+ +
Sbjct: 562 AGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQ 621

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S ST+ ++S+ + S T+ S T+ S T+ S T+ S ST
Sbjct: 622 SVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTST 681

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ ++S+ + S T+ S T+ S T+ S TS S ST+ ++S+ +
Sbjct: 682 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYG 741

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T++ S+ T+ S T+ +S T+ S ST+ ++S+ + S T+ S T
Sbjct: 742 STQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILT 801

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S T+ S T+ S ST+ + S+
Sbjct: 802 AGYGSTQTAQERSDLTTGYGSTSTAGADSS 831



Score = 91.0 bits (225), Expect = 3e-20
Identities = 143/570 (25%), Positives = 254/570 (44%), Gaps = 8/570 (1%)

Query: 967 SASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1026
S T+ S+ T+ ST + S T+ S ST+ +S+ + S T+ S T
Sbjct: 454 STQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLT 513

Query: 1027 SASQSASTSASKSASTSASQSASTSASKS---ASTSASQSASTSASKSASTSASQSAS-- 1081
+ S T+ ++S + S ST+ + S A ++Q+AS ++ +A ++Q+A
Sbjct: 514 AGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREG 573

Query: 1082 ---TSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
T+ S T+ S S+ + S T++ S+ T+ S T+ QS T+ S ST
Sbjct: 574 SDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTST 633

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
+ S+ + S T+ S T+ S T+ S T+ S ST+ + S+ +
Sbjct: 634 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYG 693

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S T+ S T+ S TS S ST+ + S+ + S T++ S+ T
Sbjct: 694 STQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLT 753

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ +S T+ S ST+ + S+ + S T+ S T+ S T+
Sbjct: 754 AGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQER 813

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S ST+ ++S+ + S T+ S T+ S T+ S T+ S ST
Sbjct: 814 SDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTST 873

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ +S+ + S T+ S T+ S T+ S T+ S ST+ ES+ +
Sbjct: 874 AGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYG 933

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T++ +S + S+ T+ +S+ T+ S S + +S+ + S T+ +S T
Sbjct: 934 STQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLT 993

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S T+ S T+ S +T+ + S+
Sbjct: 994 AGYGSTQTAEHSSTLTAGYGSTATAGADSS 1023



Score = 91.0 bits (225), Expect = 3e-20
Identities = 143/570 (25%), Positives = 254/570 (44%), Gaps = 8/570 (1%)

Query: 967 SASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1026
S T+ S+ + ST + S+ T+ S T+ S T+ S ST+ +S+
Sbjct: 438 STGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLI 497

Query: 1027 SASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKS---ASTSASQSASTS 1083
+ S T+ S T+ S T+ ++S + S ST+ + S A ++Q+AS +
Sbjct: 498 AGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYN 557

Query: 1084 ASKSASTSASQSAS-----TSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
+ +A ++Q+A T+ S T+ S S+ + S T++ S+ T+ S T
Sbjct: 558 SVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQT 617

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
QS T+ S ST+ + S+ + S T+ S T+ S T+ S T+
Sbjct: 618 AREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYG 677

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S ST+ + S+ + S T+ S T+ S T+ S TS S ST+ + S+
Sbjct: 678 STSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLI 737

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T++ S+ T+ S T+ +S T+ S ST+ ++S+ + S T+
Sbjct: 738 AGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYH 797

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S T+ S T+ S ST+ ++S+ + S T+ S T+ S T
Sbjct: 798 SILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQT 857

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ S T+ S ST+ +S+ + S T+ S T+ S T+ S T+
Sbjct: 858 AQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYG 917

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S ST+ ES+ + S T++ +S + S+ T+ +S+ T+ S S + +S+
Sbjct: 918 STSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLI 977

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S T+ +S T+ S T+ +ST
Sbjct: 978 AGYGSTQTAGYQSTLTAGYGSTQTAEHSST 1007



Score = 90.6 bits (224), Expect = 4e-20
Identities = 130/570 (22%), Positives = 255/570 (44%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +A ++ A ++++ +S+ A ++Q+ + +A ++ +A +S
Sbjct: 198 STGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLI 257

Query: 1024 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTS 1083
A ++Q+A +S +A ++Q+A + +A ++ +A +S A ++Q+A
Sbjct: 258 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEE 317

Query: 1084 ASKSASTSASQSAS-----TSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
++++A ++Q+A T+ S T+ S+ + S T+ S+ T+ S T
Sbjct: 318 STQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQT 377

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
S T+ S T+ + S+ + S T+ +S T+ S T+ S T+
Sbjct: 378 AQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYG 437

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S+ + S T+ S+ T+ S T+ S T+ S ST+ +S+
Sbjct: 438 STGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLI 497

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S T+ S T+ ++S + S ST+ + S+ + S T++
Sbjct: 498 AGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYN 557

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S T+ S T+ S T+ S+S+ + S T++ S+ T+ S T
Sbjct: 558 SVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQT 617

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ +S T+ S ST+ ++S+ + S T+ S T+ S T+ S T+
Sbjct: 618 AREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYG 677

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S ST+ ++S+ + S T+ S T+ S T+ S TS S ST+ ++S+
Sbjct: 678 STSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLI 737

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S T++ S+ T+ S T+ S
Sbjct: 738 AGYGSTQTASYHSSLTAGYGSTQTAREQSV 767



Score = 90.2 bits (223), Expect = 4e-20
Identities = 132/570 (23%), Positives = 261/570 (45%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +A + +A ++ + +S A ++Q+A ++++A ++Q+A + +
Sbjct: 278 STQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLT 337

Query: 1024 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTS 1083
A ++ +A +S A ++Q+A +S +A ++Q+A + +A ++ +A
Sbjct: 338 AGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGAD 397

Query: 1084 ASKSASTSASQSASTSASKSASTSASQSAS-----TSASKSASTSASQSASTSASKSAST 1138
+S A ++Q+A ++++A ++Q+A T+ S T+ S+ + S T
Sbjct: 398 SSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQT 457

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
S+ T+ S T+ S T+ S ST+ +S+ + S T+ S T+
Sbjct: 458 AGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYG 517

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ ++S + S ST+ + S+ + S T++ S T+ S T+ S T
Sbjct: 518 STQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLT 577

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S S+ + S T++ S+ T+ S T+ +S T+ S ST+ ++
Sbjct: 578 AGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGAD 637

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S+ + S T+ S T+ S T+ S T+ S ST+ ++S+ + S T
Sbjct: 638 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQT 697

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ S T+ S T+ S TS S ST+ ++S+ + S T++ S+ T+
Sbjct: 698 AGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYG 757

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T+ +S T+ S ST+ ++S+ + S T+ S T+ S T+ S T
Sbjct: 758 STQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLT 817

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S ST+ ++S+ + S T+ S
Sbjct: 818 TGYGSTSTAGADSSLIAGYGSTQTAGYNSI 847



Score = 89.4 bits (221), Expect = 7e-20
Identities = 137/577 (23%), Positives = 259/577 (44%), Gaps = 5/577 (0%)

Query: 956 ASTSLSKSSSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQS 1015
+S + S+ +A + +A ++ + +S A ++Q+A ++ +A ++Q+
Sbjct: 462 SSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQT 521

Query: 1016 ASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSAS-----TSASK 1070
A + ++ +A ++S A ++Q+AS ++ +A ++Q+A T+
Sbjct: 522 AQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYG 581

Query: 1071 SASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSAST 1130
S T+ S S+ + S T++ S+ T+ S T+ QS T+ S ST+ + S+
Sbjct: 582 STGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLI 641

Query: 1131 SASKSASTKASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASK 1190
+ S T S T+ S T+ S T+ S ST+ + S+ + S T+
Sbjct: 642 AGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYN 701

Query: 1191 SASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSAST 1250
S T+ S T+ S TS S ST+ + S+ + S T++ S+ T+ S T
Sbjct: 702 SILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQT 761

Query: 1251 SASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASA 1310
+ +S T+ S ST+ + S+ + S T+ S T+ S T+ S T+
Sbjct: 762 AREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYG 821

Query: 1311 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESAST 1370
S ST+ ++S+ + S T+ S T+ S T+ S T+ S ST+ +S+
Sbjct: 822 STSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLI 881

Query: 1371 SASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASE 1430
+ S T+ S T+ S T+ S T+ S ST+ ES+ + S T++ +
Sbjct: 882 AGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFK 941

Query: 1431 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESAST 1490
S + S+ T+ +S+ T+ S S + +S+ + S T+ +S T+ S T
Sbjct: 942 STLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQT 1001

Query: 1491 SASESASTSASASASTSASESASTSASASASTSASAS 1527
+ S T+ S +T+ ++S+ + S+ TS S
Sbjct: 1002 AEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRS 1038



Score = 89.4 bits (221), Expect = 7e-20
Identities = 137/570 (24%), Positives = 253/570 (44%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +A ++ +A ++++ + ++ +A ++S A ++Q+AS ++ +
Sbjct: 502 STQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLT 561

Query: 1024 ASTSASQSAS-----TSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1078
A ++Q+A T+ S T+ S S+ + S T++ S+ T+ S T+ Q
Sbjct: 562 AGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQ 621

Query: 1079 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
S T+ S ST+ + S+ + S T+ S T+ S T+ S T+ S ST
Sbjct: 622 SVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTST 681

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
+ S+ + S T+ S T+ S T+ S TS S ST+ + S+ +
Sbjct: 682 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYG 741

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T++ S+ T+ S T+ +S T+ S ST+ + S+ + S T+ S T
Sbjct: 742 STQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILT 801

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ +S T+ S ST+ + S+ + S T+ S T+ S T+
Sbjct: 802 AGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEN 861

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S ST+ +S+ + S T+ S T+ S T+ S T+ S ST
Sbjct: 862 SDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTST 921

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ ES+ + S T++ +S + S+ T+ +S+ T+ S S + +S+ +
Sbjct: 922 AGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYG 981

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T+ +S T+ S T+ S T+ S +T+ ++S+ + S+ TS S T
Sbjct: 982 STQTAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLT 1041

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S S S T+ S+ S S+
Sbjct: 1042 AGYGSTLISGLRSVLTAGYGSSLISGRRSS 1071



Score = 89.4 bits (221), Expect = 8e-20
Identities = 129/569 (22%), Positives = 265/569 (46%), Gaps = 5/569 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ + + +A ++ + +S A ++Q+A +S +A ++Q+A + +
Sbjct: 230 STQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLT 289

Query: 1024 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTS 1083
A ++ +A +S A ++Q+A ++++A ++Q+A + +A ++ +A
Sbjct: 290 AGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDD 349

Query: 1084 ASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTKASQS 1143
+S A ++Q+A +S +A ++Q+A + +A ++ +A +S A ++Q+
Sbjct: 350 SSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQT 409

Query: 1144 ASTSASQSASTSASKSAS-----TSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
A ++Q+A ++++A T+ S T+ S+ + S T+ S+ T+
Sbjct: 410 AGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYG 469

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S T+ S ST+ +S+ + S T+ S T+ S T+ ++S
Sbjct: 470 STQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLI 529

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S ST+ + S+ + S T++ S T+ S T+ S T+ S T+ S+
Sbjct: 530 TGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSD 589

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S+ + S T++ S+ T+ S T+ +S T+ S ST+ ++S+ + S T
Sbjct: 590 SSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQT 649

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ S T+ S T+ S T+ S ST+ ++S+ + S T+ S T+
Sbjct: 650 AGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYG 709

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T+ S TS S ST+ ++S+ + S T++ S+ T+ S T+ +S T
Sbjct: 710 STQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLT 769

Query: 1499 SASASASTSASESASTSASASASTSASAS 1527
+ S ST+ ++S+ + S T+ S
Sbjct: 770 TGYGSTSTAGADSSLIAGYGSTQTAGYHS 798



Score = 89.4 bits (221), Expect = 9e-20
Identities = 141/573 (24%), Positives = 249/573 (43%), Gaps = 6/573 (1%)

Query: 959 SLSKSSSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSAST 1018
S + S+ T+ S T+ S + S T+ + S+ + S T+ +S T
Sbjct: 358 STQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQT 417

Query: 1019 SASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1078
+ S T+ S T+ S T+ S+ + S T+ S+ T+ S T+
Sbjct: 418 AGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKG 477

Query: 1079 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
S T+ S ST+ +S+ + S T+ S T+ S T+ ++S + S ST
Sbjct: 478 SDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTST 537

Query: 1139 KASQS---ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTS 1195
+ S A ++Q+AS ++ +A ++Q+A S T+ S T+ S S+ +
Sbjct: 538 AGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREG---SDLTAGYGSTGTAGSDSSIIA 594

Query: 1196 ASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1255
S T++ S+ T+ S T+ +S T+ S ST+ + S+ + S T+ S
Sbjct: 595 GYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNS 654

Query: 1256 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTS 1315
T+ S T+ S T+ S ST+ + S+ + S T+ S T+ S T+
Sbjct: 655 ILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTA 714

Query: 1316 ASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASAS 1375
S TS S ST+ ++S+ + S T++ S+ T+ S T+ +S T+ S
Sbjct: 715 QEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGS 774

Query: 1376 ASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTS 1435
ST+ ++S+ + S T+ S T+ S T+ S T+ S ST+ ++S+ +
Sbjct: 775 TSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIA 834

Query: 1436 ASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASES 1495
S T+ S T+ S T+ S T+ S ST+ +S+ + S T+ S
Sbjct: 835 GYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNS 894

Query: 1496 ASTSASASASTSASESASTSASASASTSASAST 1528
T+ S T+ S T+ S ST+ S+
Sbjct: 895 ILTAGYGSTQTAQENSDLTTGYGSTSTAGYESS 927



Score = 89.4 bits (221), Expect = 9e-20
Identities = 137/570 (24%), Positives = 255/570 (44%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +A +S A ++++ ++ +A ++Q+A + ++ +A ++S
Sbjct: 486 STSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLI 545

Query: 1024 ASTSASQSASTSASKSASTSASQSAS-----TSASKSASTSASQSASTSASKSASTSASQ 1078
A ++Q+AS ++ +A ++Q+A T+ S T+ S S+ + S T++
Sbjct: 546 AGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYH 605

Query: 1079 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
S+ T+ S T+ QS T+ S ST+ + S+ + S T+ S T+ S T
Sbjct: 606 SSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQT 665

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
S T+ S ST+ + S+ + S T+ S T+ S T+ S TS
Sbjct: 666 AQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYG 725

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S ST+ + S+ + S T++ S+ T+ S T+ +S T+ S ST+ + S+
Sbjct: 726 STSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLI 785

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S T+ S T+ +S T+ S ST+ ++S+ + S T+
Sbjct: 786 AGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYN 845

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S T+ S T+ S ST+ +S+ + S T+ S T+ S T
Sbjct: 846 SILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQT 905

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ S T+ S ST+ ES+ + S T++ +S + S+ T+ +S+ T+
Sbjct: 906 AQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYG 965

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S S + +S+ + S T+ +S T+ S T+ S T+ S +T+ ++S+
Sbjct: 966 STSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLI 1025

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S+ TS S T+ S S S
Sbjct: 1026 AGYGSSLTSGIRSFLTAGYGSTLISGLRSV 1055



Score = 89.0 bits (220), Expect = 1e-19
Identities = 138/562 (24%), Positives = 247/562 (43%), Gaps = 6/562 (1%)

Query: 970 TSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSAS 1029
T+ S T+ S+ ++ S T+ S+ T+ S T+ S T+ S T+ +
Sbjct: 337 TAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGA 396

Query: 1030 QSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAS 1089
S+ + S T+ +S T+ S T+ S T+ S T+ S+ + S
Sbjct: 397 DSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQ 456

Query: 1090 TSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTKASQSASTSAS 1149
T+ S+ T+ S T+ S T+ S ST+ +S+ + S T S T+
Sbjct: 457 TAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGY 516

Query: 1150 QSASTSASKSASTSASQSASTSASKS---ASTSASQSASTSASKSASTSASQSASTSASK 1206
S T+ ++S + S ST+ + S A ++Q+AS ++ +A ++Q+A
Sbjct: 517 GSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREG--- 573

Query: 1207 SASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSAST 1266
S T+ S T+ S S+ + S T++ S+ T+ S T+ +S T+ S ST
Sbjct: 574 SDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTST 633

Query: 1267 SASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASESASTSASA 1326
+ + S+ + S T+ S T+ S T+ S T+ S ST+ ++S+ +
Sbjct: 634 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYG 693

Query: 1327 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESAST 1386
S T+ S T+ S T+ S TS S ST+ ++S+ + S T++ S+ T
Sbjct: 694 STQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLT 753

Query: 1387 SASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASE 1446
+ S T+ +S T+ S ST+ ++S+ + S T+ S T+ S T+
Sbjct: 754 AGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQER 813

Query: 1447 SASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESASTSASASAST 1506
S T+ S ST+ ++S+ + S T+ S T+ S T+ S T+ S ST
Sbjct: 814 SDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTST 873

Query: 1507 SASESASTSASASASTSASAST 1528
+ +S+ + S T+ S
Sbjct: 874 AGYDSSLIAGYGSTQTAGYNSI 895



Score = 88.7 bits (219), Expect = 1e-19
Identities = 142/570 (24%), Positives = 251/570 (44%), Gaps = 8/570 (1%)

Query: 967 SASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1026
S T+ S T+ ST + S+ + S T+ S+ T+ S T+ S T
Sbjct: 422 STQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLT 481

Query: 1027 SASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASK 1086
+ S ST+ +S+ + S T+ S T+ S T+ ++S + S ST+ +
Sbjct: 482 AGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGAN 541

Query: 1087 SASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKS---ASTKASQS 1143
S+ + S T++ S T+ S T+ S T+ S T+ S S A ++Q+
Sbjct: 542 SSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQT 601

Query: 1144 ASTSASQSASTSASKSAS-----TSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
AS +S +A ++++A T+ S ST+ + S+ + S T+ S T+
Sbjct: 602 ASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYG 661

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S T+ S ST+ + S+ + S T+ S T+ S T+ S T
Sbjct: 662 STQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLT 721

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
S S ST+ + S+ + S T++ S+ T+ S T+ +S T+ S ST+ ++
Sbjct: 722 SGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGAD 781

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S+ + S T+ S T+ S T+ S T+ S ST+ ++S+ + S T
Sbjct: 782 SSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQT 841

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ S T+ S T+ S T+ S ST+ +S+ + S T+ S T+
Sbjct: 842 AGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYG 901

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T+ S T+ S ST+ ES+ + S T++ +S + S+ T+ +S+ T
Sbjct: 902 STQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLT 961

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S S + +S+ + S T+ ST
Sbjct: 962 AGYGSTSMAGYDSSLIAGYGSTQTAGYQST 991



Score = 88.7 bits (219), Expect = 1e-19
Identities = 132/577 (22%), Positives = 252/577 (43%), Gaps = 7/577 (1%)

Query: 956 ASTSLSKSSSISASTSASKSASTSASKSTSVSAS-KSASTSASQSASTSASKSASTSASQ 1014
A T + TS K + S + + A+ +S ST +Q+ + S + Q
Sbjct: 114 ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGSTLSGTHQ 173

Query: 1015 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1074
S + S T+ S + S T+ + S + S T+ +S+ + S T
Sbjct: 174 SQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQT 233

Query: 1075 SASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASK 1134
S T+ S T+ S+ + S T+ S+ T+ S T+ S T+
Sbjct: 234 GMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYG 293

Query: 1135 SASTKASQS---ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1191
S T + S A ++Q+A ++++A ++Q+A + +A ++ +A +S
Sbjct: 294 STGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLI 353

Query: 1192 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTS 1251
A ++Q+A +S +A ++Q+A + +A ++ +A +S A ++Q+A
Sbjct: 354 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEE 413

Query: 1252 ASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASAS 1311
++++A ++Q+A S T+ S T+ S+ + S T+ +S+ T+ S
Sbjct: 414 STQTAGYGSTQTAQKG---SDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGS 470

Query: 1312 ASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTS 1371
T+ S T+ S ST+ ES+ + S T+ S T+ S T+ +ES +
Sbjct: 471 TQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLIT 530

Query: 1372 ASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASES 1431
S ST+ + S+ + S T++ S T+ S T+ S T+ S T+ S+S
Sbjct: 531 GYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDS 590

Query: 1432 ASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTS 1491
+ + S T++ S+ T+ S T+ +S T+ S ST+ ++S+ + S T+
Sbjct: 591 SIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTA 650

Query: 1492 ASESASTSASASASTSASESASTSASASASTSASAST 1528
S T+ S T+ S T+ S ST+ + S+
Sbjct: 651 GYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSS 687



Score = 87.1 bits (215), Expect = 4e-19
Identities = 139/572 (24%), Positives = 250/572 (43%), Gaps = 6/572 (1%)

Query: 959 SLSKSSSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSAST 1018
S + + S+ + S T+ +ST + S T+ S T+ S T+ S+
Sbjct: 294 STGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLI 353

Query: 1019 SASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1078
+ S T+ S+ T+ S T+ S T+ S T+ + S+ + S T+ +
Sbjct: 354 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEE 413

Query: 1079 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
S T+ S T+ S T+ S T+ S+ + S T+ S+ T+ S T
Sbjct: 414 STQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQT 473

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
S T+ S ST+ +S+ + S T+ S T+ S T+ ++S +
Sbjct: 474 AQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYG 533

Query: 1199 SASTSASKS---ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1255
S ST+ + S A ++Q+AS ++ +A ++Q+A S T+ S T+ S S
Sbjct: 534 STSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREG---SDLTAGYGSTGTAGSDS 590

Query: 1256 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTS 1315
+ + S T++ S+ T+ S T+ +S T+ S ST+ ++S+ + S T+
Sbjct: 591 SIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTA 650

Query: 1316 ASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASAS 1375
S T+ S T+ S T+ S ST+ ++S+ + S T+ S T+ S
Sbjct: 651 GYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGS 710

Query: 1376 ASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTS 1435
T+ S TS S ST+ ++S+ + S T++ S+ T+ S T+ +S T+
Sbjct: 711 TQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTT 770

Query: 1436 ASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASES 1495
S ST+ ++S+ + S T+ S T+ S T+ S T+ S ST+ ++S
Sbjct: 771 GYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADS 830

Query: 1496 ASTSASASASTSASESASTSASASASTSASAS 1527
+ + S T+ S T+ S T+ S
Sbjct: 831 SLIAGYGSTQTAGYNSILTAGYGSTQTAQENS 862



Score = 84.8 bits (209), Expect = 2e-18
Identities = 136/570 (23%), Positives = 252/570 (44%), Gaps = 5/570 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +AS ++ +A ++++ + +A ++ +A + +S A ++Q+AS +S +
Sbjct: 550 STQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLT 609

Query: 1024 ASTSASQSAS-----TSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1078
A ++Q+A T+ S ST+ + S+ + S T+ S T+ S T+
Sbjct: 610 AGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEG 669

Query: 1079 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
S T+ S ST+ + S+ + S T+ S T+ S T+ S TS S ST
Sbjct: 670 SDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTST 729

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
+ S+ + S T++ S+ T+ S T+ +S T+ S ST+ + S+ +
Sbjct: 730 AGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYG 789

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S T+ S T+ +S T+ S ST+ + S+ + S T+ S T
Sbjct: 790 STQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILT 849

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S T+ S ST+ S+ + S T+ S T+ S T+
Sbjct: 850 AGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEN 909

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S T+ S ST+ ES+ + S T++ +S + S+ T+ +S+ T+ S S
Sbjct: 910 SDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSM 969

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ +S+ + S T+ +S T+ S T+ S T+ S +T+ ++S+ +
Sbjct: 970 AGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYG 1029

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S+ TS S T+ S S S T+ S+ S S+ T+ S ++ S+
Sbjct: 1030 SSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLI 1089

Query: 1499 SASASASTSASESASTSASASASTSASAST 1528
+ S + + S + S+ T+ ST
Sbjct: 1090 AGPESTQITGNRSMLIAGKGSSQTAGYRST 1119



Score = 80.2 bits (197), Expect = 5e-17
Identities = 137/569 (24%), Positives = 244/569 (42%), Gaps = 8/569 (1%)

Query: 967 SASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1026
S T+ S S+ + ST ++ S+ T+ S T+ +S T+ S ST+ + S+
Sbjct: 582 STGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLI 641

Query: 1027 SASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASK 1086
+ S T+ S T+ S T+ S T+ S ST+ + S+ + S T+
Sbjct: 642 AGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYN 701

Query: 1087 SASTSASQSASTSASKSASTSASQSASTSASKS---ASTSASQSASTSASKSASTKASQS 1143
S T+ S T+ S TS S ST+ + S A ++Q+AS +S +A ++Q+
Sbjct: 702 SILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQT 761

Query: 1144 AS-----TSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
A T+ S ST+ + S+ + S T+ S T+ S T+ +S T+
Sbjct: 762 AREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYG 821

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S ST+ + S+ + S T+ S T+ S T+ S T+ S ST+ S+
Sbjct: 822 STSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLI 881

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S T+ S T+ S T+ S ST+ ES+ + S T++ +
Sbjct: 882 AGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFK 941

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S + S+ T+ +S+ T+ S S + +S+ + S T+ +S T+ S T
Sbjct: 942 STLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQT 1001

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ S T+ S +T+ ++S+ + S+ TS S T+ S S S T+
Sbjct: 1002 AEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYG 1061

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S+ S S+ T+ S ++ S+ + S + + S + S+ T+ S
Sbjct: 1062 SSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLI 1121

Query: 1499 SASASASTSASESASTSASASASTSASAS 1527
S + S + + + S T+ S
Sbjct: 1122 SGADSVQMAGERGKLIAGADSTQTAGDRS 1150



Score = 78.6 bits (193), Expect = 2e-16
Identities = 135/569 (23%), Positives = 243/569 (42%), Gaps = 8/569 (1%)

Query: 967 SASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1026
S T++ S+ T+ ST + +S T+ S ST+ + S+ + S T+ S T
Sbjct: 598 STQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILT 657

Query: 1027 SASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASK 1086
+ S T+ S T+ S ST+ + S+ + S T+ S T+ S T+
Sbjct: 658 AGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEG 717

Query: 1087 SASTSASQSASTSASKS---ASTSASQSASTSASKSASTSASQSAS-----TSASKSAST 1138
S TS S ST+ + S A ++Q+AS +S +A ++Q+A T+ S ST
Sbjct: 718 SDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTST 777

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
+ S+ + S T+ S T+ S T+ +S T+ S ST+ + S+ +
Sbjct: 778 AGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYG 837

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S T+ S T+ S T+ S ST+ S+ + S T+ S T
Sbjct: 838 STQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILT 897

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S T+ S T+ S ST+ +S+ + S T++ +S + S+ T+ +
Sbjct: 898 AGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQ 957

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S+ T+ S S + +S+ + S T+ +S T+ S T+ S T+ S +T
Sbjct: 958 SSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTAT 1017

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
+ ++S+ + S+ TS S T+ S S S T+ S+ S S+ T+
Sbjct: 1018 AGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLTAGYG 1077

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S ++ S+ + S + + S + S+ T+ S S ++S +
Sbjct: 1078 SNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLI 1137

Query: 1499 SASASASTSASESASTSASASASTSASAS 1527
+ + S T+ S + + S T+ S
Sbjct: 1138 AGADSTQTAGDRSKLLAGNNSYLTAGDRS 1166



Score = 70.2 bits (171), Expect = 6e-14
Identities = 120/553 (21%), Positives = 233/553 (42%), Gaps = 5/553 (0%)

Query: 964 SSISASTSASKSASTSASKSTSVSASKSASTSASQSASTSASKSASTSASQSASTSASKS 1023
S+ +A + +A ++ + +S A ++Q+A ++ +A ++Q+A + +
Sbjct: 662 STQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLT 721

Query: 1024 ASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSAS-----TSASKSASTSASQ 1078
+ ++ +A +S A ++Q+AS +S +A ++Q+A T+ S ST+ +
Sbjct: 722 SGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGAD 781

Query: 1079 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1138
S+ + S T+ S T+ S T+ +S T+ S ST+ + S+ + S T
Sbjct: 782 SSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQT 841

Query: 1139 KASQSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQ 1198
S T+ S T+ S T+ S ST+ S+ + S T+ S T+
Sbjct: 842 AGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYG 901

Query: 1199 SASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSASTSASQSASTSASKSAST 1258
S T+ S T+ S ST+ +S+ + S T++ KS + S+ T+ +S+ T
Sbjct: 902 STQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLT 961

Query: 1259 SASQSASTSASKSASTSASQSASTSASKSASTSASASASTSASESASTSASASASTSASE 1318
+ S S + S+ + S T+ +S T+ S T+ S T+ S +T+ ++
Sbjct: 962 AGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTATAGAD 1021

Query: 1319 SASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASASAST 1378
S+ + S+ TS S T+ S S S T+ S+ S S+ T+ S
Sbjct: 1022 SSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQI 1081

Query: 1379 SASESASTSASASASTSASESASTSASASASTSASESASTSASASASTSASESASTSASA 1438
++ S+ + S + + S + S+ T+ S S + S + + +
Sbjct: 1082 ASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGAD 1141

Query: 1439 SASTSASESASTSASASASTSASESASTSASASASTSASESASTSASESASTSASESAST 1498
S T+ S + + S T+ S T+ + + S T+ S T+ S
Sbjct: 1142 STQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDCILMAGDRSKLTAGINSILTAGCRSKLI 1201

Query: 1499 SASASASTSASES 1511
++ S T+ S
Sbjct: 1202 GSNGSTLTAGENS 1214


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1653  SECYTRNLCASE  139    3e-39    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 139 bits (352), Expect = 3e-39
Identities = 85/387 (21%), Positives = 175/387 (45%), Gaps = 40/387 (10%)

Query: 11 KKGLYTLFLLFIYVLGSRITLPFVDLNSKDFL----GGSTAYLAFSAALTGGNLRSLSIF 66
KK L+TL ++ +Y +G+ I +P VD + G+ +GG L ++IF
Sbjct: 16 KKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGLVNMFSGGALLQITIF 75

Query: 67 SVGLSPWMSAMILWQMFSYS----KKLGLSSTAIEIQDRRKM-----YLTLLIAVIQSLA 117
++G+ P+++A I+ Q+ + + L A K+ YLT+ +A++Q
Sbjct: 76 ALGIMPYITASIILQLLTVVIPRLEALKKEGQA----GTAKITQYTRYLTVALAILQGTG 131

Query: 118 VSLSLPVQSSY------------SAILVVLMNTLLLIAGTFFLVWLSDLNASMGIG-GSI 164
+ + + +I + + + AGT ++WL +L GIG G
Sbjct: 132 LVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWLGELITDRGIGNGMS 191

Query: 165 VILLSSIVLNIPQDVIETFQTVHIPTGIIVLLALLTLVFSYLLAIMY--RARYLVPVN-- 220
+++ SI P + + + G I ++ + + +++ +A+ +PV
Sbjct: 192 ILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVVFVEQAQRRIPVQYA 251

Query: 221 --KIGLHNRFKRYSYLEIMLNPAGGMPYMYVMSFLSVPAYLFILLGFIFPNHAGLAALSK 278
IG + +Y+ + +N AG +P ++ S L +PA + F N + + +
Sbjct: 252 KRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALV---AQFAGGNSGWKSWVEQ 308

Query: 279 EFMVG-KPLWVYVYISVLFLFSIIFAFVTMNGEEIADRMKKSGEYIYGIYPGEDTSRFIN 337
G P+++ Y ++ F+ + ++ N EE+AD MKK G +I GI G T+ +++
Sbjct: 309 NLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGFIPGIRAGRPTAEYLS 368

Query: 338 GLVLRFSVIGGLFNVIMAGGPMLFVLF 364
++ R + G L+ ++A P + ++
Sbjct: 369 YVLNRITWPGSLYLGLIALVPTMALVG 395


9  smi_1637  smi_1598  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_1637   0      25        3.162604  hypothetical protein
smi_1636   2      22       -2.763350  CorA-like Mg2+ transporter protein
smi_1635   0      16       -1.685108  hypothetical protein
smi_1634   0      12       -1.786964  hypothetical protein
smi_1633  -1      12       -1.529383  hypothetical protein
smi_1632   0      12       -1.999832  nicotinic acid mononucleotide
smi_1631   0      11       -0.811149  hypothetical protein
smi_1630   0      11        3.074233  hypothetical protein
smi_1629  -1      11        2.904468  hypothetical protein
smi_1628  -2      13        3.677080  hypothetical protein
smi_1627  -3      14        3.713905  hypothetical protein
smi_1626  -2      17        3.758009  hypothetical protein
smi_1625  -3      16        3.064860  hypothetical protein
smi_1624  -2      18        2.578073  hypothetical protein
smi_1623  -2      19        3.041796  guanylate kinase
smi_1622  -2      19        3.309399  DNA-dependent RNA Polymerase, omega subunit
smi_1621  -1      17        2.744314  primosomal protein N'
smi_1620   0      19        2.942857  methionyl-tRNA formyltransferase
smi_1619   0      22        3.173042  16S rRNA (cytosine(967)-C(5))-methyltransferase
smi_1618   0      23        3.494819  serine/threonine phosphatase
smi_1617   1      27        3.130412  serine/threonine protein kinase
smi_1616   1      22        0.334981  hypothetical protein
smi_1615   1      20       -2.183703  hypothetical protein
smi_1614  -1      13       -4.032451  3-hydroxy-3-methylglutaryl CoA synthase
smi_1613   0      14       -4.356129  3-hydroxy-3-methylglutaryl-coenzyme a reductase
smi_1612   0      15       -5.404927  transcriptional regulator
smi_1611   0      18       -6.031821  glycosyl hydrolases family 32
smi_1610   0      15       -5.011349  PTS system
smi_1609  -1      14       -3.620554  transcriptional regulator/sugar kinase
smi_1608   0      15       -3.566224  hypothetical protein
smi_1607  -2      17       -3.660064  hypothetical protein
smi_1606  -2      15       -2.737186  hypothetical protein
smi_1605  -1      12       -1.029643  hypothetical protein
smi_1604  -1      14       -0.123035  type I restriction-modification system DNA
smi_1603   0      24        2.720857  hypothetical protein
smi_1602   0      23        3.067510  restriction endonuclease S subunit
smi_1601   0      20        2.404796  restriction endonuclease S subunit
smi_1600   1      21        3.337072  type I restriction-modification system
smi_1599   0      18        3.708720  type I restriction-modification system, helicase
smi_1598  -1      19        3.576812  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1637  LPSBIOSNTHSS  28     0.019    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 27.9 bits (62), Expect = 0.019
Identities = 16/72 (22%), Positives = 32/72 (44%), Gaps = 5/72 (6%)

Query: 27 GILGGNFNPVHNAHLVVADQVRQQLGLDQVLLMPEYQPPHVDKKETIPEHHRLKMLELAI 86
I G+F+P+ HL + ++ + DQV + P +K+ RL+ + AI
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRL--FDQVYVAVLRNP---NKQPMFSVQERLEQIAKAI 57

Query: 87 EGIEGLSIETIE 98
+ +++ E
Sbjct: 58 AHLPNAQVDSFE 69


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_1630  RTXTOXIND  41     7e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 7e-06
Identities = 21/192 (10%), Positives = 56/192 (29%), Gaps = 18/192 (9%)

Query: 38 NAEQEATNLRGQAEREADLLVNEAKRESKSLKKEALLEAKEEARKYREEVDAEFKSERQE 97
+ R Q + L + + + +E R + +F + + +
Sbjct: 143 LLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLR-LTSLIKEQFSTWQNQ 201

Query: 98 LKQIESRLTERATSLDRKDDNLTSKEQTLEQKEQSISDRAK----------NLDAREEQL 147
Q E L ++ + E ++ + D + + +E +
Sbjct: 202 KYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKY 261

Query: 148 EEVERQKEAELERIG----ALSQAEARDIILAQTEENLTKEIASRIREAEQEVKERSDKM 203
E + ++ + A+ + EI ++R+ + + ++
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEE---YQLVTQLFKNEILDKLRQTTDNIGLLTLEL 318

Query: 204 AKDILVQAMQRI 215
AK+ Q I
Sbjct: 319 AKNEERQQASVI 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1609  BONTOXILYSIN  31     0.017    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 30.6 bits (69), Expect = 0.017
Identities = 15/77 (19%), Positives = 27/77 (35%), Gaps = 1/77 (1%)

Query: 404 TTILVLAKNKLENKTLFIDAS-KEFKKETNNNVLTDSNIEHIVELFSNYQNVDYKSALVG 462
T I K+ L N F K ++ N + + E+ S YQ + + S
Sbjct: 787 TNINDNEKSILINSYTFKTIDFKFLDIQSIKNFFNSQVEQVMKEILSPYQLLLFASKGPN 846

Query: 463 NDVIGSEQDYNLSVSTY 479
+++I N +
Sbjct: 847 SNIIEDISGKNTLIQYT 863


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1608  YERSSTKINASE  30     0.019    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 29.7 bits (66), Expect = 0.019
Identities = 42/172 (24%), Positives = 74/172 (43%), Gaps = 32/172 (18%)

Query: 146 IRSSEKVFYRQVLDLFATSSDYNANSPEAKKFFATVQNKMHYAIHHNTASELIYNRIDSE 205
I E + R + D+ S+D +S EA ++H + T E +S
Sbjct: 374 IAGVETAYTRFITDILGVSADSRPDSNEA---------RLHEFLSDGTIDE------ESA 418

Query: 206 KEFLGLTTFKGDLPTLSEAKVAKNYLTEKELRGLNQLVSGYLDFAERQAEREEVMTMADW 265
K+ L T G++ LS +T K+LR L+ L+ +L A + + M
Sbjct: 419 KQILK-DTLTGEMSPLS---TDVRRITPKKLRELSDLLRTHLSSAATKQ-----LDMGGV 469

Query: 266 VTHVDRILLATGEDLLDNSGSISREQMEHKVDKEYKSYQAKTLSQVEKDYLK 317
++ +D +L+A D + G + ++Q+ K + S KT +E DY+K
Sbjct: 470 LSDLDTMLVAL--DKAEREGGVDKDQL-----KSFNSLILKTYRVIE-DYVK 513


10  smi_1560  smi_1525  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_1560   1      21        3.004932  hypothetical protein
smi_1559   0      21        3.400308  triose phosphate isomerase
smi_1558  -1      20        3.835755  choline-binding protein LytC,
smi_1557  -2      19        3.400308  DPS family peroxide resistance protein
smi_1556  -1      18        3.044201  dihydrofolate reductase
smi_1555  -1      16        3.140598  hypothetical protein
smi_1554  -2      14        2.417391  ATP-dependent Clp protease ATP-binding subunit
smi_1553  -1      13        1.526376  hypothetical protein
smi_1552  -1      15        1.546870  hypothetical protein
smi_1551  -2      13        1.746548  P-loop-containing kinase
smi_1550  -1      15        2.686701  hypothetical protein
smi_1549   0      20        2.398753  hypothetical protein
smi_1548  -2      17        3.046467  thioredoxin reductase
smi_1547  -1      18        3.553946  hypothetical protein
smi_1546  -2      19        3.238602  hypothetical protein
smi_1545  -2      20        3.249563  phosphoglucosamine mutase
smi_1544  -2      19        2.795771  hypothetical protein
smi_1543   1      17        1.565527  hypothetical protein
smi_1542   2      20        0.934557  4-hydroxy-tetrahydrodipicolinate reductase
smi_1541   2      19        0.786079  tRNA nucleotidyltransferase
smi_1540   2      18        0.361601  ATPase component of ABC transporter with
smi_1539   2      16        0.261605  cation efflux family protein
smi_1538   2      16        0.277201  yybP-ykoY element
smi_1537   1      14       -0.326347  cation-transporting ATPase, E1-E2 family
smi_1536   1      17        0.535620  polypeptide deformylase
smi_1535   1      19        0.694375  hypothetical protein
smi_1534   1      19        0.847732  hypothetical protein
smi_1533   2      23        1.316970  hypothetical protein
smi_1532   2      22        1.684477  cell wall surface anchor family protein
smi_1531   2      23        2.587496  N-acetyl-beta-hexosaminidase
smi_1530   0      14        2.525274  transposase
smi_1529  -1      17        2.955842  transposase, IS1167
smi_1528  -1      18        3.666721  beta-galactosidase
smi_1527  -1      19        3.887869  hypothetical protein
smi_1526   3      23        2.965771  transcriptional regulator
smi_1525   2      19        2.446255  cell wall surface anchor family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1555  FERRIBNDNGPP  29     0.019    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 29.1 bits (65), Expect = 0.019
Identities = 18/65 (27%), Positives = 25/65 (38%), Gaps = 3/65 (4%)

Query: 141 GTEVAGESHIADHPGMIDHVYVTNTLNDDTPLASRRVVQTILESDMIVLGPGSLFTSILP 200
+ A E+H+A + I + PL + I M+V GP SLF IL
Sbjct: 146 NLQSAAETHLAQYEDFIRSMKPRFVKRGARPLL---LTTLIDPRHMLVFGPNSLFQEILD 202

Query: 201 NIVIK 205
I
Sbjct: 203 EYGIP 207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1540  FLGMRINGFLIF  30     3e-04    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.9 bits (67), Expect = 3e-04
Identities = 9/28 (32%), Positives = 14/28 (50%)

Query: 32 KKDKFLSILTSLAGIALVLVAVWLGWPK 59
++ F+ L + LVLV W+ W K
Sbjct: 450 QQQSFIDQLLAAGRWLLVLVVAWILWRK 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1538  V8PROTEASE  74     1e-15    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 74.3 bits (182), Expect = 1e-15
Identities = 68/344 (19%), Positives = 127/344 (36%), Gaps = 40/344 (11%)

Query: 86 PKTEEELLAKEKETATSSAVSDTLPEELRGKLNKAEENGRTASKEELEKEDK--SLVPED 143
T L++ A SS D P++ + + + + + + LE+ + ++P +
Sbjct: 15 TLTTATLVSSPAANALSSKAMDNHPQQTQSSKQQTPKIQKGGNLKPLEQREHANVILPNN 74

Query: 144 ----VAKTKNGVLNYGATVEIKSSAG----LGSGIVIGENLVLSVSHNFIKDVPDGNNRK 195
+ T NG +Y I+ A + SG+V+G++ +L+ H D +
Sbjct: 75 DRHQITDTTNG--HYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPH-AL 131

Query: 196 VADNVESDGDVYTVSYKGAPDVKFSKNDVKHWDREGFLKGYKNDLAIVKL------RTPL 249
A + D Y P+ F+ + + EG DLAIVK +
Sbjct: 132 KAFPSAINQDNY-------PNGGFTAEQITKYSGEG-------DLAIVKFSPNEQNKHIG 177

Query: 250 ANAPVEVIDKPSTIKVGDKVHVFGYPKGELDPILNTTVEDINNHGEGVRGISYQGS-EPG 308
+ + +V + V GYP + + + I + Y S G
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKIT--YLKGEAMQYDLSTTGG 235

Query: 309 ASGGGIFDENGKLIGIHQNGVSGKRSGGILFSPAQLEWIQNYIKGIETTKPAGLDALDKQ 368
SG +F+E ++IGIH GV + +G + + ++N++K D
Sbjct: 236 NSGSPVFNEKNEVIGIHWGGVPNEFNGAVFINEN----VRNFLKQNIEDIHFANDDQPNN 291

Query: 369 VEDKEEKPKEDKPQEEKPADNKPAENKPADNKPAENKPADNKPA 412
++ + D P +N N P + +N +DN A
Sbjct: 292 PDNPDNPNNPDNPNNPDEPNNPDNPNNPDNPDNGDNNNSDNPDA 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1537  GPOSANCHOR  39     4e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 38.5 bits (89), Expect = 4e-04
Identities = 17/74 (22%), Positives = 29/74 (39%), Gaps = 4/74 (5%)

Query: 11 HYSIRKFTIGAASVMIGASIFGAGMVQA----AETEGTAETEGTVTQAQPLDKLPADIAA 66
HYS+RK G ASV + ++ GAG+V + ++T+ + DK +
Sbjct: 9 HYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEIENNT 68

Query: 67 AIEKAEASAGTVDG 80
K +
Sbjct: 69 LKLKNSDLSFNNKA 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1534  GPOSANCHOR  50     1e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 49.7 bits (118), Expect = 1e-07
Identities = 39/203 (19%), Positives = 72/203 (35%), Gaps = 5/203 (2%)

Query: 11 DKRCHYSIRKFAIGVASVMIGASIFGIS-AVQAEEAASSNTQTEETTVHQAQP-LDKLPD 68
+ HYS+RK G ASV + ++ G V E ++ T+++ T+ + Q DK
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 69 DVAAAIAKADENGGR-EFVKPKSELAEDKVTKDTETTRPANDGSHELASPKVETPNKVEE 127
+ K + + +K ++ ++++ E R + E AS E + +
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 128 GNKAEDKQKSEEANPKPVESAVTAGTEVRDDAKKTSEKDQVKQTTDIKSSSEKTQALSKE 187
KA + + + A K EK + S K + L E
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 188 SSKADVEKEKQLLSDRKQDFNKD 210
KA +E + L +
Sbjct: 185 --KAALEARQAELEKALEGAMNF 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1532  PF05043  292    1e-95    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 292 bits (748), Expect = 1e-95
Identities = 201/488 (41%), Positives = 299/488 (61%), Gaps = 2/488 (0%)

Query: 1 MRNLLSTKDQRQLRLMETLIQNRNWLRLHELADKLGCTERILKSDLNELRTAFPTIDIQS 60
MR+LLS K RQL L+E L +++ W ELA+ L CTER +K DL+ +++AFP + S
Sbjct: 1 MRDLLSKKSHRQLELLELLFEHKRWFHRSELAELLNCTERAVKDDLSHVKSAFPDLIFHS 60

Query: 61 SINGIMIDLNMQTSVEDVYQHFLAHSQSFQLLEYLFFNEGLPIYRTLENLHSSRANLYRL 120
S NGI I + +E VY HF HS F +LE++FFNEG + + S ++LYR+
Sbjct: 61 STNGIRIINTDDSDIEMVYHHFFKHSTHFSILEFIFFNEGCQAESICKEFYISSSSLYRI 120

Query: 121 GRNITKTLSTQFQIELSFTPSEIRGNEIDIRYFYAQYFSERYYFLDWPFPYIPEEDLTEF 180
I K + QFQ E+S TP +I GNE DIRYF+AQYFSE+YYFL+WPF E L++
Sbjct: 121 ISQINKVIKRQFQFEVSLTPVQIIGNERDIRYFFAQYFSEKYYFLEWPFENFSSEPLSQL 180

Query: 181 ADFFYKITNYPMHFSIYRMYKLMLAISIYRIKNGHFIDLPNH-FYDEYYPLLMGIPNFEE 239
+ YK T++PM+ S +RM KL+L ++YRIK GHF+++ F D+ LM E
Sbjct: 181 LELVYKETSFPMNLSTHRMLKLLLVTNLYRIKFGHFMEVDKDSFNDQSLDFLMQAEGIEG 240

Query: 240 TLVYFSEKLGLEITPDIIAQIFISFIQNNLFLDPQEFLNSLEENSEARYSYQLLSQILER 299
F + + + +++ Q+F+S+ Q F+D F+ ++++S SY LLS +++
Sbjct: 241 VAQSFESEYNISLDEEVVCQLFVSYFQKMFFIDESLFMKCVKKDSYVEKSYHLLSDFIDQ 300

Query: 300 LSKQYQITFTNHDELIWHLHNTAFFESQEIFSTPILFEQKTLTIKKFEVYFPDFMASARQ 359
+S +YQI N D LIWHLHNTA QE+F+ ILF+QK TI+ F+ FP F++ ++
Sbjct: 301 ISVKYQIEIENKDNLIWHLHNTAHLYRQELFTEFILFDQKGNTIRNFQNIFPKFVSDVKK 360

Query: 360 ELAQYRQAIGKHDHPEQLEHLMYTILTHAENLSTLLLENRPPIKVLIISNFDHALSLTFV 419
EL+ Y + + + HL YT +TH ++L LL+N+P +KVL++SNFD +
Sbjct: 361 ELSHYLETLEVCSSSMMVNHLSYTFITHTKHLVINLLQNQPKLKVLVMSNFDQYHAKFVA 420

Query: 420 DMLSYYCNNRFIFDIWDELKTSPEILNQTDYDIIVSNFYIPGI-TKKFICRNHLSIMDLV 478
+ LSYYC+N F ++W EL+ S E L + YDII+SNF IP I K+ I N+++ + L+
Sbjct: 421 ETLSYYCSNNFELEVWTELELSKESLEDSPYDIIISNFIIPPIENKRLIYSNNINTVSLI 480

Query: 479 NHLNTLSN 486
LN +
Sbjct: 481 YLLNAMMF 488


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1531  TONBPROTEIN  51     2e-08    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 51.1 bits (122), Expect = 2e-08
Identities = 26/81 (32%), Positives = 31/81 (38%), Gaps = 4/81 (4%)

Query: 2882 PAQPTPNVPIPEVPVK-PVPAQPTPNVPTPEVPVQPTPVVPTPEVPVKPVPAVPEQP--- 2937
+P V P PV P P P E PV P P+ KPV V EQP
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113

Query: 2938 VVPTPAQPATPVNANPVVSTT 2958
V P ++PA+P T
Sbjct: 114 VKPVESRPASPFENTAPARLT 134



Score = 41.1 bits (96), Expect = 3e-05
Identities = 22/78 (28%), Positives = 26/78 (33%), Gaps = 1/78 (1%)

Query: 2884 QPTPNVPIPEVPVKPVPAQPTPNVPTPEVP-VQPTPVVPTPEVPVKPVPAVPEQPVVPTP 2942
P P PI V P +P V P P V+P P P K P V E+P
Sbjct: 38 LPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 97

Query: 2943 AQPATPVNANPVVSTTVK 2960
+P VK
Sbjct: 98 PKPKPVKKVQEQPKRDVK 115



Score = 39.6 bits (92), Expect = 1e-04
Identities = 26/100 (26%), Positives = 31/100 (31%), Gaps = 9/100 (9%)

Query: 2868 PEVPEVPRLDIPTVPAQPTPNVPIPEVPV---KPVPAQPTPNVPTPEVPVQPTPVVPTPE 2924
V P + P P E PV KP P P +V QP V
Sbjct: 59 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK--- 115

Query: 2925 VPVKPVPAVPEQPVVPTPAQPATPVNA--NPVVSTTVKEN 2962
PV+ PA P + P +T A PV S
Sbjct: 116 -PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPR 154



Score = 33.8 bits (77), Expect = 0.007
Identities = 16/52 (30%), Positives = 19/52 (36%), Gaps = 1/52 (1%)

Query: 2893 EVPVKPVPAQPTPNVPTPEVPVQPTPVVPTPEVPVK-PVPAVPEQPVVPTPA 2943
+V P PAQP ++P V P PV P P P P A
Sbjct: 34 QVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEA 85



Score = 32.7 bits (74), Expect = 0.016
Identities = 30/135 (22%), Positives = 38/135 (28%), Gaps = 17/135 (12%)

Query: 2811 VTPSNDKPVPPTPNMPEGPKFAMPEPPVHELPEFNGGVPGMPEVHELPEFNSGVPGMPEV 2870
VTP++ +P PE PEP P P V E P+ P
Sbjct: 50 VTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-----KEAPVVIEKPKPKPKPKPKPV- 103

Query: 2871 PEVPRLDIPTVPAQPTPNV-PIPEVPVKPVPAQPTPNVPTPEVPVQPTPVVPTPEVPVKP 2929
V QP +V P+ P P + + P V P
Sbjct: 104 --------KKVQEQPKRDVKPVESRPASPFENTAPARLTSS--TATAATSKPVTSVASGP 153

Query: 2930 VPAVPEQPVVPTPAQ 2944
QP P AQ
Sbjct: 154 RALSRNQPQYPARAQ 168


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1529  TCRTETA  34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.0 bits (78), Expect = 0.001
Identities = 48/313 (15%), Positives = 106/313 (33%), Gaps = 14/313 (4%)

Query: 43 GLLESIFHTTSLLCEIPSGMLADRYSYKTNLYLSRIAGIVSSILMLAGQGNFWIYALAMA 102
G+L +++ C G L+DR+ + L +S V +M W+ +
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATA-PFLWVLYIGRI 104

Query: 103 VSALSYNFDSGTSAAMVYDSAVEAGLKERYLSISSFLSGVSEGTQSLGTVLAGFFVHGQL 162
V+ ++ +G A + + R+ F+S G VL G
Sbjct: 105 VAGITGA--TGAVAGAYIADITDGDERARHF---GFMSACFGFGMVAGPVLGGLMGGFSP 159

Query: 163 HLTYYIMIATSIIVLFLIWMLKEPSVKVEKADSVTMKQIMWTVKDELKRN-----PMLFN 217
H ++ A + + L S K E+ + + + R ++
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESHKGER-RPLRREALNPLASFRWARGMTVVAALMAV 218

Query: 218 WMILSQIVGVLMCMFYFYYQNQLPDLSGWQISAVMLLGSLLNIVA-VYLASKIGKNYAAL 276
+ I+ + V ++ + +++ I + +L+ +A + +
Sbjct: 219 FFIMQLVGQVPAALWVIFGEDRF-HWDATTIGISLAAFGILHSLAQAMITGPVAARLGER 277

Query: 277 RLFPILVLLTGVTYMLSYFGTPLIYILIYLISNALHALFQPIFDNDLQGRLPSEVRATML 336
R + ++ G Y+L F T ++ A + P L ++ E + +
Sbjct: 278 RALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQ 337

Query: 337 SVYSMMFSLSMIV 349
+ + SL+ IV
Sbjct: 338 GSLAALTSLTSIV 350


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1527  FLGPRINGFLGI  29     0.027    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 29.1 bits (65), Expect = 0.027
Identities = 8/21 (38%), Positives = 10/21 (47%)

Query: 31 DILSLTLGEPDFTTPKNIQDA 51
L L L PDF+T + D
Sbjct: 191 VNLVLQLRNPDFSTAVRVADV 211


11  smi_1479  smi_1457  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
smi_1479  -2      15       3.309143    response regulator
smi_1478  -2      16       3.299106    histidine kinase
smi_1477  -2      21       3.516994    hypothetical protein
smi_1476  -1      22       3.716560    zinc metalloprotease
smi_1475  -1      20       2.033424    chorismate binding enzyme para-aminobenzoate
smi_1474  -1      19       1.274010    hypothetical protein
smi_1473  -1      16       -1.012285   choline binding protein Cbp11
smi_1472  -1      16       -1.645420   glucose kinase
smi_1471  0       20       -4.075066   thymidylate synthase
smi_1470  0       26       -6.404221   hypothetical protein
smi_1469  2       28       -7.078677   tRNA isopentenylpyrophosphate transferase
smi_1468  1       28       -7.108413   hypothetical protein
smi_1467  3       28       -6.483487   hypothetical protein
smi_1466  2       24       -5.705602   hydrolase
smi_1465  2       21       -5.133764   dehydrogenase
smi_1464  1       18       -4.333784   transcriptional regulator
smi_1463  -1      16       -4.217960   hypothetical protein
smi_1462  -1      17       -3.108281   hypothetical protein
smi_1461  -3      17       -2.814981   choline binding protein Cbp10
smi_1460  -3      17       -3.628577   hypothetical protein
smi_1459  -2      16       -2.854791   hypothetical protein
smi_1458  1       14       -0.817238   hypothetical protein
smi_1457  2       19       0.086535    ATPases of the AAA+ class
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1478  PF03309  35     3e-04    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 35.1 bits (81), Expect = 3e-04
Identities = 25/126 (19%), Positives = 46/126 (36%), Gaps = 14/126 (11%)

Query: 5 IIGIDLGGTSIKFAILTTAGEIQE---KWSIKTNILDEGSHIVDDMIESIQHRLDLLGVA 61
++ ID+ T +++ +G+ + +W I+T D++ +I L+G
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTA----DELALTI---DGLIGDD 54

Query: 62 AADFQGIGMGSPGVVDREKGTVIGAYNLNWKTLQPIKEKIEKALGIPFFIDNDANVAALG 121
A G S V V W + + + GIP +DN V A
Sbjct: 55 AERLTGASGLS--TVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGA-- 110

Query: 122 ERWMGA 127
+R +
Sbjct: 111 DRIVNC 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1471  DHBDHDRGNASE  82     1e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 81.6 bits (201), Expect = 1e-20
Identities = 50/182 (27%), Positives = 88/182 (48%), Gaps = 6/182 (3%)

Query: 4 ILITGASGGLAQEMVKLLPND--QLILLGRNKEKLAQLYGKHP----HAEWIEIDITDDS 57
ITGA+ G+ + + + L + + + N EKL ++ HAE D+ D +
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 58 ALETLVADLYLRYGKIDVLINNAGYGIFEEFDQISDQDIHQMFEVNTFALMNLSRCLAAR 117
A++ + A + G ID+L+N AG +SD++ F VN+ + N SR ++
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 118 MKESRKGHIINIVSMAGLIATGKSSLYSATKFAAIGFSNALRLELMPHGVYVTTVNPGPI 177
M + R G I+ + S + + Y+++K AA+ F+ L LEL + + V+PG
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 178 RT 179
T
Sbjct: 191 ET 192


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_1463  HTHFIS  36     8e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.6 bits (82), Expect = 8e-04
Identities = 33/127 (25%), Positives = 45/127 (35%), Gaps = 28/127 (22%)

Query: 391 LNNVKKEVQKLLRTVEFNQKRLSEGLPIQEQ----------SLHSVFTGNPGTGKTTVAR 440
L K+ KL + + +QE L + TG GTGK VAR
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVAR 178

Query: 441 LLGR---------VLFDRGVLPGDEFKFIEVSESDLIATHIGE--TAVQTQA-ILEKAKG 488
L V + +P D + ES+L G A E+A+G
Sbjct: 179 ALHDYGKRRNGPFVAINMAAIPRD------LIESELFGHEKGAFTGAQTRSTGRFEQAEG 232

Query: 489 GILFIDE 495
G LF+DE
Sbjct: 233 GTLFLDE 239



Score = 32.5 bits (74), Expect = 0.008
Identities = 28/155 (18%), Positives = 55/155 (35%), Gaps = 45/155 (29%)

Query: 638 DIDDVL------LQGTQQPEENQKDALEQLQNLIG----IEKVKKQVEQFISLAELNKKR 687
D+ +++ L ++ +D + L+G ++++ + + +
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLAR----------- 155

Query: 688 EEQGAAVSEFSLHSLFLGNPGTGKTTVARIVGKILYQKGIIPQNKFIEVS---------R 738
+ + L + G GTGK VAR L+ G F+ ++
Sbjct: 156 ------LMQTDLTLMITGESGTGKELVAR----ALHDYGKRRNGPFVAINMAAIPRDLIE 205

Query: 739 SDLVAGYVG---QTAIKTQE-VLKSALGGVLFIDE 769
S+L G+ A + A GG LF+DE
Sbjct: 206 SELF-GHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1458  PF06580  33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.002
Identities = 23/90 (25%), Positives = 42/90 (46%), Gaps = 16/90 (17%)

Query: 435 LVEELATNAVKYA-----SGGRISLKIDVHDDAIFIISENN----CLQTEKSLGYGLKNM 485
LVE N +K+ GG+I LK + + + EN T++S G GL+N+
Sbjct: 263 LVE----NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNV 318

Query: 486 SNKISVLGG---QMQIFENNKRFRVEITLP 512
++ +L G Q+++ E + + +P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_1457  HTHFIS  48     9e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.5 bits (113), Expect = 9e-09
Identities = 25/118 (21%), Positives = 47/118 (39%), Gaps = 6/118 (5%)

Query: 2 KILLIDDHKLFSQSIKMILELSENIKKVQLVDNFST-ISEIAFNDYDIILIDINLTSLYQ 60
IL+ DD + L + V++ N +T IA D D+++ D+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP---D 59

Query: 61 TDGLTLAQEIIDNGCSSKVVILTGYSKKMYEHRAKVMGVYGFLDKSMDPDELVKKLEK 118
+ L I V++++ + M +A G Y +L K D EL+ + +
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


12  smi_1447  smi_1437  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
smi_1447  3       17       -1.643935   membrane GTPase TypA
smi_1446  3       20       -3.506396   transposase, IS1167
smi_1445  4       26       -6.093241   hypothetical protein
smi_1444  2       23       -6.264924   ABC transporter ATP-binding protein
smi_1443  2       20       -5.288748   MurD D-glutamic acid adding enzyme
smi_1442  2       17       -3.863414   MurG UDP-PP-MurNAc-pentapeptide-UDPGlcNAc GlcNAc
smi_1441  1       18       -0.968917   cell division protein DivIB
smi_1440  -1      16       1.304566    orotidine-5'-phosphate decarboxylase
smi_1439  -1      19       2.790052    orotate phosphoribosyltransferase
smi_1438  -2      17       3.040697    hypothetical protein
smi_1437  -2      14       3.203147    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1447  IGASERPTASE  35     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 5e-04
Identities = 14/53 (26%), Positives = 24/53 (45%), Gaps = 1/53 (1%)

Query: 351 ADKLIMEAEEKAKQEAKEAEKKQEEERKRLEEEKKKQEEESNRNQTSQRSSRR 403
+ E +E E KE ++EE+ ++E E K QE +Q S + +
Sbjct: 1085 VAQSGSETKETQTTETKETATVEKEEKAKVETE-KTQEVPKVTSQVSPKQEQS 1136



Score = 35.0 bits (80), Expect = 6e-04
Identities = 20/120 (16%), Positives = 46/120 (38%), Gaps = 5/120 (4%)

Query: 2 SKDKKNEGKEILEEFKELSEWQKRNQEYLKKKAE-EEAALAEEKEKERQARMASKSEESD 60
+ + + +E+ +E K + + E + +E +E E KE + E++
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETE 1117

Query: 61 ETGDRESESNPEDPESAKGESEEKVESSEGDKEEEEIEESGSKEKEEQDKNLAKKEKATK 120
+T + ++ P+ + E+ + + + E KE + Q A E+ K
Sbjct: 1118 KTQEVPKVTSQVSPKQEQSETVQP----QAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1438  PF05272  33     0.001    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.001
Identities = 15/57 (26%), Positives = 22/57 (38%), Gaps = 6/57 (10%)

Query: 30 KGEVVVIL-GPSGCGKSTLLRCLNGLESIQGGDILLYGQSIVENKKDFHLVRQKIGM 85
K + V+L G G GKSTL+ L GL+ I K + + +
Sbjct: 594 KFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHF-----DIGTGKDSYEQIAGIVAY 645


13  smi_1342  smi_1306  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
smi_1342  2       17       -3.273359   transcriptional regulator
smi_1341  1       25       -6.529954   choline-binding protein Cbp9
smi_1340  3       30       -5.550120   glycerol uptake facilitator
smi_1339  2       30       -5.250257   hypothetical protein
smi_1338  1       26       -5.687656   hypothetical protein
smi_1337  2       25       -4.211988   hypothetical protein
smi_1336  1       25       -3.381216   transcriptional regulator
smi_1335  1       25       -3.239257   GMP-synthase
smi_1334  1       25       -3.188345   site-specific recombinase, phage integrase
smi_1333  1       27       -3.291936   hypothetical protein
smi_1332  1       27       -2.921303   polyribonucleotide nucleotidyltransferase
smi_1331  0       26       -4.349071   hypothetical protein
smi_1330  1       26       -5.360203   hypothetical protein
smi_1329  1       25       -5.946201   transcriptional regulator, Tn916
smi_1328  0       24       -5.267512   hypothetical protein
smi_1327  0       22       -4.320340   tetracycline resistance protein TetM
smi_1326  0       22       -4.092280   hypothetical protein
smi_1325  1       23       -2.795174   cell wall hydrolase, lytic transglycosylase
smi_1324  1       23       -5.491881   transmembrane amino acid transporter protein
smi_1323  0       26       -7.584042   ATP/GTP-binding protein
smi_1322  1       27       -7.307942   hypothetical protein
smi_1321  1       24       -6.062223   hypothetical protein
smi_1320  1       25       -6.315676   hypothetical protein
smi_1319  0       24       -6.273675   hypothetical protein
smi_1318  0       22       -4.710326   hypothetical protein
smi_1317  2       18       -2.651348   transcriptional regulator, Tn916
smi_1316  1       19       -0.018441   hypothetical protein
smi_1315  1       22       -1.052314   hypothetical protein
smi_1314  1       24       0.359088    hypothetical protein
smi_1313  2       25       1.214939    hypothetical protein
smi_1312  2       23       1.615374    hypothetical protein
smi_1311  4       27       -0.363697   hypothetical protein
smi_1310  5       26       -1.130941   hypothetical protein
smi_1309  5       25       -0.991858   peptidase
smi_1308  4       22       -0.960358   proton-translocating ATPase, F0 sector, subunit
smi_1307  4       22       -1.481720   proton-translocating ATPase, F0 sector, subunit
smi_1306  3       21       -1.605186   proton-translocating ATPase, F0 sector, subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_1334  TCRTETOQM  1101   0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 1101 bits (2849), Expect = 0.0
Identities = 603/639 (94%), Positives = 622/639 (97%)

Query: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60
MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI
Sbjct: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120
TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120

Query: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNVCVTNFTESEQWDTVIE 180
IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPN+CVTNFTESEQWDTVIE
Sbjct: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE 180

Query: 181 GNDDLLEKYMSGKSLEALELEQEESIRFQNCSLFPLYHGSAKSNIGIDNLIEVITNKFYS 240
GNDDLLEKYMSGKSLEALELEQEESIRF NCSLFP+YHGSAK+NIGIDNLIEVITNKFYS
Sbjct: 181 GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYS 240

Query: 241 STHRGPSELCGNVFKIEYTKKRQRLAYIRLYSGVLHLRDSVRVSEKEKIKVTEMYTSING 300
STHRG SELCG VFKIEY++KRQRLAYIRLYSGVLHLRDSVR+SEKEKIK+TEMYTSING
Sbjct: 241 STHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSING 300

Query: 301 ELCKIDRAYSGEIVILQNEFLKLNSVLGDTKLLPQRKKIENPHPLLQTTVEPSKPEQREM 360
ELCKID+AYSGEIVILQNEFLKLNSVLGDTKLLPQR++IENP PLLQTTVEPSKP+QREM
Sbjct: 301 ELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREM 360

Query: 361 LLDALLEISDSDPLLRYYVDSTTHEIILSFLGKVQMEVISALLQEKYHVEIELKEPTVIY 420
LLDALLEISDSDPLLRYYVDS THEIILSFLGKVQMEV ALLQEKYHVEIE+KEPTVIY
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420

Query: 421 MERPLKNAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480
MERPLK AEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG
Sbjct: 421 MERPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480

Query: 481 IRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQAFRKAGTELLEPYL 540
IRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQ +KAGTELLEPYL
Sbjct: 481 IRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYL 540

Query: 541 SFKVYAPQEYLSRAYNDAPKYCANIVNTQLKNNEVIIIGEIPARCIQDYRNDLTFFTNGL 600
SFK+YAPQEYLSRAY DAPKYCANIV+TQLKNNEVI+ GEIPARCIQ+YR+DLTFFTNG
Sbjct: 541 SFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGR 600

Query: 601 SVCLAELKGYQVTTGEPVCQTRRLNSQIDKVRYMFNKIT 639
SVCL ELKGY VTTGEPVCQ RR NS+IDKVRYMFNKIT
Sbjct: 601 SVCLTELKGYHVTTGEPVCQPRRPNSRIDKVRYMFNKIT 639


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1306  GPOSANCHOR  46     2e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.8 bits (108), Expect = 2e-06
Identities = 63/351 (17%), Positives = 116/351 (33%), Gaps = 15/351 (4%)

Query: 1084 ALTAKNQEIDDRKDLTQAEKDAAKAEAKKLADAELTKVNAQPDNAETAEAAAAAQKLVND 1143
+ + +AEK A +A +L + L A+K
Sbjct: 166 GAMNFSTADSAKIKTLEAEKAALEARQAEL-EKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 1144 AEDKGVADVTSVYPIAKEEAKKAVADELAKKEKELDDRKDLTTEEKAAAKKEAKDLAKKA 1203
A + E + + K K L+ K +A +K + +
Sbjct: 225 ARKADLE--------KALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFS 276

Query: 1204 TDAINAQPAIADTSDKATAAQQAVDTAKTTGVAEVKAVNPEAVKKNVAKKAIE---DALT 1260
T + A + ++ A +++ + AKK +E L
Sbjct: 277 TADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLE 336

Query: 1261 AKNNAIDA-RDDLTPEQKTAAKEAAKAKADAAKDAIDKATTDADVDQAKTDGETAVANVT 1319
+N +A R L + + + + +A+ K ++A + D + +
Sbjct: 337 EQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKK 396

Query: 1320 PVAK-KPAKQAIAKALEDKNAEIDKRTDLTEEEKATAKKEAKDKADAQLAEIAKQ-PDVA 1377
V K + ALE N E+++ LTE+EKA + + + +A A ++AKQ ++A
Sbjct: 397 QVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEELA 456

Query: 1378 DTPEAAQTAQTAVDAAKKTGVDEVTAVNPTAVTKPEAKKAIDAKLAEQLKT 1428
+ DA P A TKP KA + QL +
Sbjct: 457 KLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPS 507



Score = 41.2 bits (96), Expect = 5e-05
Identities = 83/458 (18%), Positives = 162/458 (35%), Gaps = 49/458 (10%)

Query: 1073 AKQNAKDAIGNALTAKNQEIDDRKDLTQAEKDAAKAEAKKLADAELTKVNAQPDNAETAE 1132
+ +L+ K +I + + + A +A + A T +A+ E +
Sbjct: 96 NAKEKLRKNDKSLSEKASKIQELEA-----RKADLEKALEGAMNFSTADSAKIKTLEAEK 150

Query: 1133 AAAAAQKLVNDAEDKGVADVTSVYPIAKEEAKKAVADELAKKEKELDDRKDLTTEEKAAA 1192
AA AA+K + +G + + A A L ++ L+ R+ + A
Sbjct: 151 AALAARKADLEKALEGAMNFS--------TADSAKIKTLEAEKAALEARQAELEKALEGA 202

Query: 1193 KKEAKDLAKKATDAINAQPAIADTSDKATAAQQAVDTAKTTGVAEVKAVNPEAVKKNVAK 1252
+ + K + A+A A + T A++K + E +
Sbjct: 203 MNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQ 262

Query: 1253 KAIEDALTAKNNAIDARDDLTPEQKTAAKEAAKAKADAAKDAIDKATTDADVDQAKTDGE 1312
+E AL N A + KT E A +A+ A +A+ + D +
Sbjct: 263 AELEKALEGAMNFSTADSA---KIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLD 319

Query: 1313 TAVANVTPVAKKPAKQAIAKALEDKNAEIDKRTDLTEEEKATAKKEAKDKADAQLAEIAK 1372
+ AKK + K E R L A +EAK + +A+ ++ +
Sbjct: 320 ASRE-----AKKQLEAEHQKLEEQNKISEASRQSL--RRDLDASREAKKQLEAEHQKLEE 372

Query: 1373 QPDVADTPEAAQTAQTAVDAAKKTGVDEVTAVNPTAVTKPEAKKAIDAKLAEQLKTIEST 1432
Q +++ + Q+ + +DA++ EAKK ++ L E + +
Sbjct: 373 QNKISEA--SRQSLRRDLDASR------------------EAKKQVEKALEEANSKLAAL 412

Query: 1433 PDATDDEKKVAADAAKALAAKAKAEIDKAGTDADVKALEDEAKAEIEKSLPLVEDKPNAR 1492
+ + +K L K KAE+ +A +A+ KAL+++ + E+ L K +
Sbjct: 413 EKLNKEL-----EESKKLTEKEKAEL-QAKLEAEAKALKEKLAKQAEELAKLRAGKASDS 466

Query: 1493 KAIDEEATAKKAAIDSRTDLTPKAKEDLKAQVDKIAEQ 1530
+ D + K + KA + + Q
Sbjct: 467 QTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQ 504


14  smi_1269  smi_1248  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
smi_1269  2       16       -0.608690   permease
smi_1268  1       20       -3.098610   Sec-independent protein translocase tatC
smi_1267  2       24       -4.878594   Sec-independent protein translocase tatA
smi_1266  0       33       -7.268945   hypothetical protein
smi_1265  2       35       -8.472738   hypothetical protein
smi_1263  5       39       -10.920528  50S ribosomal protein L31 type B
smi_1262  5       38       -11.789238  hypothetical protein
smi_1261  1       26       -8.729746   flavodoxin
smi_1260  0       19       -5.251309   chorismate mutase/prephenate dehydratase
smi_1259  0       20       -4.752939   CrcB protein
smi_1258  1       18       -3.615846   CrcB protein
smi_1257  -1      16       -0.881615   50S ribosomal protein L19
smi_1256  1       12       -1.569591   *hypothetical protein
smi_1255  0       14       -2.921513   hypothetical protein
smi_1254  -1      17       -3.884528   hypothetical protein
smi_1253  -1      18       -4.636048   site-specific recombinase, phage integrase
smi_1252  -2      17       -4.446842   transcriptional regulator
smi_1251  0       20       -5.498925   hypothetical protein
smi_1250  -1      19       -4.939629   permease
smi_1249  -3      15       -3.346766   hydrolase
smi_1248  -4      14       -3.206823   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1265  FLGMOTORFLIM  26     0.041    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 26.0 bits (57), Expect = 0.041
Identities = 18/63 (28%), Positives = 27/63 (42%), Gaps = 8/63 (12%)

Query: 3 PLIQSLTEGQLR-TDIPSFRPGDTVRVHAKVVE-------GNRERIQIFEGVVIARKGAG 54
++ + +L DI R GD +R+H V GNR++ GVV + A
Sbjct: 260 DVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIGNRKKFLCQPGVVGKKIAAQ 319

Query: 55 ISE 57
I E
Sbjct: 320 ILE 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1257  TCRTETB  29     0.024    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.5 bits (66), Expect = 0.024
Identities = 40/214 (18%), Positives = 91/214 (42%), Gaps = 14/214 (6%)

Query: 153 IAGVLMLKFNINSIFFINSASFLLLFLSILISYIPDQEHKENGNDDIKEIFVGFKENFTN 212
+LM + + F S S L +S+L I + ++ + + N
Sbjct: 202 KGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVD------PGLGKN 255

Query: 213 IKLRISLLFTIVVNIAMLGFNATIIYYLQDQLKLSNSLVGIVYSIAGFGSLIAVTFLSTF 272
I I +L ++ + GF + + Y ++D +LS + +G V G S+I ++
Sbjct: 256 IPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGI 315

Query: 273 L-NKKDTFILMNISMATIPLVIMVSGIVEN----WIFFGICYSI--LSGLITIASVSITT 325
L +++ ++NI + + + + + + ++ I + + LS T+ S +++
Sbjct: 316 LVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSS 375

Query: 326 IQQQESTEYNIGKILSSSFV-IATIFAPFGGILA 358
+Q+ + + +SF+ T A GG+L+
Sbjct: 376 SLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


15  smi_1120  smi_1088  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
smi_1120  -3      15       -3.613981   peptide chain release factor I
smi_1119  0       23       -6.225900   protoporphyrinogen oxidase
smi_1118  2       32       -8.702822   hypothetical protein
smi_1117  1       32       -9.309906   hypothetical protein
smi_1116  0       32       -9.299492   serine hydroxymethyltransferase
smi_1115  2       30       -9.379495   hypothetical protein
smi_1114  1       31       -9.589132   hypothetical protein
smi_1113  0       32       -9.183339   hypothetical protein
smi_1112  2       29       -8.317920   23S rRNA (uracil-5-)-methyltransferase RumA
smi_1111  1       25       -6.205421   hypothetical protein
smi_1110  0       21       -5.348113   hypothetical protein
smi_1109  0       19       -4.061156   hypothetical protein
smi_1108  0       18       -3.004241   hypothetical protein
smi_1107  0       18       -3.206193   hypothetical protein
smi_1106  0       21       -3.517255   hypothetical protein
smi_1105  -1      21       -4.903908   hypothetical protein
smi_1104  0       25       -6.071906   aminoglycoside 6'-N-acetyltransferase
smi_1103  1       34       -8.822737   hypothetical protein
smi_1102  0       34       -9.536990   hypothetical protein
smi_1101  -1      33       -9.562776   hypothetical protein
smi_1100  -1      34       -11.105468  hypothetical protein
smi_1099  -1      34       -11.119460  hydrolase
smi_1098  -1      35       -11.348625  hypothetical protein
smi_1097  -1      32       -9.478299   neopullulanase
smi_1096  -1      32       -9.045788   hypothetical protein
smi_1095  0       31       -9.007378   transcriptional regulator
smi_1094  1       29       -8.043716   hypothetical protein
smi_1093  2       28       -7.698883   hypothetical protein
smi_1092  2       22       -6.175238   hypothetical protein
smi_1091  2       19       -6.132692   hypothetical protein
smi_1090  1       15       -5.109094   hypothetical protein
smi_1089  1       18       0.381750    hypothetical protein
smi_1088  2       22       0.545161    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1120  THERMOLYSIN  28     0.043    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.4 bits (63), Expect = 0.043
Identities = 23/180 (12%), Positives = 51/180 (28%), Gaps = 14/180 (7%)

Query: 1 MRKKLFLTSAAVLWAVTAVNSVHAATDVQKVIDETYVQPEYVLGSSLSEDQ--------- 51
M K+ L + + + + A +A V +E + P +V GS L
Sbjct: 1 MNKRAMLGAIGLAFGLMAWPFGASAKGKSMVWNEQWKTPSFVSGSLLGRCSQELVYRYLD 60

Query: 52 KNQTLKKLGYNASTDTKELKTMTPDIYSKIMNVANDSS-LQLYSSAKIQKLGDKSPLEVK 110
+ + +LG A + ++ +M + + + + D +
Sbjct: 61 QEKNTFQLGGQARERLSLIGNKLDELGHTVMRFEQAIAASLCMGAVLVAHVNDGELSSLS 120

Query: 111 IETPENIT----KVTQDMYRNAAVTLGVEHAKITVAAPIPVTGESALAGIYYSLEANGAK 166
N+ K + A + + V P E + + +
Sbjct: 121 GTLIPNLDKRTLKTEAAISIQQAEMIAKQDVADRVTKERPAAEEGKPTRLVIYPDEETPR 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1112  V8PROTEASE  34     0.001    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 33.8 bits (77), Expect = 0.001
Identities = 7/63 (11%), Positives = 19/63 (30%), Gaps = 2/63 (3%)

Query: 38 DEEKSNFGNIGKSFSYGKFNLIGLGDYQKV--WVIDKNTVSAKDNFLPIARIINYDLNIF 95
+E+ + G + K + + V + DK + ++ I + +
Sbjct: 170 NEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKITYLKGEAMQYD 229

Query: 96 TYI 98

Sbjct: 230 LST 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1111  SACTRNSFRASE  38     3e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.0 bits (88), Expect = 3e-06
Identities = 24/111 (21%), Positives = 37/111 (33%), Gaps = 10/111 (9%)

Query: 20 PQLTDKEAIDEVKRYTNGKNTAIFTEVENDTIVG-LALCSLRFDYVEGCKYSPVGFLEGI 78
P E D Y + A F + +G + + S Y +E I
Sbjct: 45 PYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYA---------LIEDI 95

Query: 79 IVDEEYRLKDIAKNLCTKCEEWAKNKGCKEFASDCTLTNTNSIRFHLNIGF 129
V ++YR K + L K EWAK + N ++ F+ F
Sbjct: 96 AVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1100  PF05211  32     0.002    Neuraminyllactose-binding hemagglutinin
		>PF05211#Neuraminyllactose-binding hemagglutinin

Length = 260

Score = 32.3 bits (73), Expect = 0.002
Identities = 23/108 (21%), Positives = 44/108 (40%), Gaps = 4/108 (3%)

Query: 51 TEKDDKTALHLNFYPQQFVGDIKNADVIILAKNPGYSDEYEKLYNNNKDYQQTCLDSLQ- 109
+ ++ AL LN++P + +++L YSD K Y N ++ ++
Sbjct: 32 IIETNEVALKLNYHPASEKVQALDEKILLLRPAFQYSDNIAKEYENK--FKNQTTLKVEQ 89

Query: 110 -LKKVGFHAFELDEGDKLGFTAAKFKFWFEDAGMGDLFKTNDDYKKHV 156
L+ G+ +D DK F+ A+ K + M D K+ +
Sbjct: 90 ILQNQGYKVINVDSSDKDDFSFAQKKEGYLAVAMNGEIVLRPDPKRTI 137


16  smi_1046  smi_1028  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
smi_1046  2       34       0.578459    hypothetical protein
smi_1045  1       31       -0.605813   dihydroxyacetone kinase
smi_1044  1       28       -0.725685   dihydroxyacetone kinase
smi_1043  0       28       -0.776146   transcriptional regulator
smi_1042  -3      23       -1.507439   dihydroxyacetone kinase
smi_1041  -3      18       -2.612899   hypothetical protein
smi_1040  -2      18       -3.278328   hypothetical protein
smi_1039  -2      18       -3.282259   phosphoenolpyruvate-protein phosphotransferase,
smi_1038  -3      18       -4.169488   phosphocarrier protein HPr
smi_1037  1       25       -5.775747   glutaredoxin-like protein
smi_1036  1       25       -5.902393   ribonucleoside-diphosphate reductase 2, alpha
smi_1035  -1      25       -5.363419   ribonucleoside-diphosphate reductase 2, beta
smi_1034  -2      21       -3.053667   s system repressor
smi_1033  -1      22       -1.965965   hypothetical protein
smi_1032  1       29       -1.296817   6-phospho-beta-galactosidase
smi_1031  3       32       -0.477964   PTS system, lactose-specific IIBC component
smi_1030  2       27       -0.747468   PTS system, lactose-specific IIA component
smi_1029  2       36       -0.400925   transcription antiterminator LacT
smi_1028  2       34       -0.611815   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1046  PHPHTRNFRASE  806    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 806 bits (2083), Expect = 0.0
Identities = 335/573 (58%), Positives = 443/573 (77%), Gaps = 4/573 (0%)

Query: 1 MTEMLKGIAASDGVAVAKAYLLVQPDLSFETISVEDTNAEEARLDVALEASQNELSLIRE 60
M + GIAAS GVA+AKA++ ++P++ E S+ D + E +L ALE S+ EL I++
Sbjct: 1 MHHKITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKD 60

Query: 61 KAVGTLGEEAAQVFDAHLMVLSDPELIGQIKETIRAKKVNAEAGLKEVTDMFITIFEGME 120
+ ++G + A++F AHL+VL DPEL+ IK I +++NAE LKEV+DMF+++FE M
Sbjct: 61 QTEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESM- 119

Query: 121 DNPYMQERAADIRDVTKRVLANLLGKKLPNPASINEEVIVIAHDLTPSDTAQLDKNFVKA 180
DN YM+ERAADIRDV+KRVL +L+G + + A+I EE ++IA DLTPSDTAQL+K FVK
Sbjct: 120 DNEYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKG 179

Query: 181 FVTNIGGRTSHSAIMARTLEIAAVLGTNNITEVVKDGDILAVNGITGEVIISPTDEQAAE 240
F T+IGGRTSHSAIM+R+LEI AV+GT +TE ++ GD++ V+GI G VI++PT+E+
Sbjct: 180 FATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKA 239

Query: 241 FKAAGEAYAKQKAEWALLKDAQTVTADGKHFELAANIGTPKDVEGVNNNGAEAVGLYRTE 300
++ A+ KQK EWA L + T DG H ELAANIGTPKDV+GV NG E +GLYRTE
Sbjct: 240 YEEKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTE 299

Query: 301 FLYMDSQDFPTEDEQYEAYKAVLEGMNGKPVVVRTMDIGGDKELPYFDMPHEMNPFLGFR 360
FLYMD PTE+EQ+EAYK V++ M+GKPVV+RT+DIGGDKEL Y +P E+NPFLGFR
Sbjct: 300 FLYMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFR 359

Query: 361 ALRISISETGDAMFRTQIRALLRASVHGQLRIMFPMVALLKEFRAAKAVYEEEKANLLAE 420
A+R+ + + +FRTQ+RALLRAS +G L++MFPM+A L+E R AKA+ +EEK LL+E
Sbjct: 360 AIRLCLEKQD--IFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSE 417

Query: 421 GVAVADDIQVGIMIEIPAAAMLADQFAKEVDFMSIGTNDLIQYSMAADRMNEQVSYLYQP 480
GV V+D I+VGIM+EIP+ A+ A+ FAKEVDF SIGTNDLIQY+MAADRMNE+VSYLYQP
Sbjct: 418 GVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQP 477

Query: 481 YNPSILRLINNVIKAAHAEGKWAGMCGEMAGDQKAVPLLVGMGLDEFSMSATSVLRTRSL 540
Y+P+ILRL++ VIKAAH+EGKW GMCGEMAGD+ A+PLL+G+GLDEFSMSATS+L RS
Sbjct: 478 YHPAILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQ 537

Query: 541 MKKLDTAKMEEYANRALTECSTMEEVLELQKEY 573
+ KL +++ +A +AL T EEV +L K+
Sbjct: 538 LLKLSKEELKPFAQKALM-LDTAEEVEQLVKKT 569


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1031  FERRIBNDNGPP  29     0.007    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 28.8 bits (64), Expect = 0.007
Identities = 13/41 (31%), Positives = 17/41 (41%)

Query: 33 NSTQGSADFVDSTVNAITEFRKYKQARAILFDKFGHSFMPA 73
N+ QG +F ST +I YK + FD M A
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDA 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1028  BINARYTOXINA  28     0.033    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 28.5 bits (63), Expect = 0.033
Identities = 20/57 (35%), Positives = 27/57 (47%), Gaps = 5/57 (8%)

Query: 171 VLESPHKPTVIKPNNEELSQLLGREVS-EDLDELKEVLQEPLFAGIEWIIVSLGANG 226
ESP K N+E+ E+S E +ELKE +Q+ LF + VSL G
Sbjct: 134 YFESPEKFAF----NKEIRTENQNEISLEKFNELKETIQDKLFKQDGFKDVSLYEPG 186


17  smi_1010  smi_1002  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
smi_1010  4       25       0.424922    hypothetical protein
smi_1009  4       29       0.563716    farnesyl diphosphate synthase
smi_1008  0       19       -0.220416   exodeoxyribonuclease VII, small subunit
smi_1007  4       29       -1.135161   exodeoxyribonuclease VII, large subunit
smi_1006  4       29       -1.460073   Uridine kinase
smi_1005  3       28       -1.586088   tRNA pseudouridine synthase B
smi_1004  3       26       -1.670179   hypothetical protein
smi_1003  3       25       -1.596494   O-acetylhomoserine sulfhydrylase
smi_1002  5       31       -1.540100   formate/nitrate transporter
ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
smi_1002GPOSANCHOR572e-09 Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 56.6 bits (136), Expect = 2e-09
Identities = 39/242 (16%), Positives = 76/242 (31%), Gaps = 14/242 (5%)

Query: 19 QRFSIRKYHFGAASVLLGTALILG--AAQTTAKAEETATENKTEAVASAPKDDKASENVT 76
+ +S+RK G ASV + ++ T + + DK
Sbjct: 8 RHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEIENN 67

Query: 77 NVTTP-ALSATTEAAVVENPTLSDEEVAKLAAEASKKDDKASETA-TTEKTEAADKEKAT 134
+ + + A+ ++ EE++ + K D SE A ++ EA +
Sbjct: 68 TLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEK 127

Query: 135 LTAPLTDKKADKAVDEKADKKDEKKAENPITATK-----TVLEQLTSEAEVLNTTASNFA 189
+ A + K +AE A + LE + + +
Sbjct: 128 AL-----EGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLE 182

Query: 190 DKKAEDKAGKEAIATAVASAKIQIEASKKTLAAGEITKDELDAQLQRISSAIEAVYDEMK 249
+KA +A + + A+ A A + E K L A+ + A+E +
Sbjct: 183 AEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 242

Query: 250 RA 251

Sbjct: 243 AD 244



Score = 39.7 bits (92), Expect = 3e-04
Identities = 34/196 (17%), Positives = 67/196 (34%), Gaps = 7/196 (3%)

Query: 40 ILGAAQTTAKAEETATENKTEAVASAPKDDKASENVTNVTTPALSATTEAAVVENPTLSD 99
L A + + N + A ++ K +A + L E A+ + S
Sbjct: 152 ALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSA 211

Query: 100 EEVAKLAAEASKKDDKASETATTEKTEAADKEKATLTAPLTDKKADKAVDEKADKKDEKK 159
+ A +A+ KA E + L +KA + +K +
Sbjct: 212 KIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEG 271

Query: 160 AENPITATKTVLEQLTSEAEVLNTTASNFADKKAEDKAGKEAIATAVASAKIQIEASKKT 219
A N TA ++ L +E L +KA+ + + + S + ++AS++
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEA-------EKADLEHQSQVLNANRQSLRRDLDASREA 324

Query: 220 LAAGEITKDELDAQLQ 235
E +L+ Q +
Sbjct: 325 KKQLEAEHQKLEEQNK 340



Score = 38.9 bits (90), Expect = 5e-04
Identities = 30/195 (15%), Positives = 55/195 (28%), Gaps = 5/195 (2%)

Query: 45 QTTAKAEETATENKTEAVASAPKDDKASENVTNVTTPALSATTEAAVVENPTLSDEEVAK 104
+AK + E A A +KA E N +T + + + + +
Sbjct: 138 ADSAKIKTLEAEKAALAARKA-DLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELE 196

Query: 105 LAAEASKKDDKASETATTEKTEAADKEKATLTAPLTDKKADKAVDEKADKKDEKKAENPI 164
A E + A A + K + E
Sbjct: 197 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTL-EAEK 255

Query: 165 TATKTVLEQLTSEAEVLNTTASNFADKKAEDKAGKEAIATAVASAKIQI---EASKKTLA 221
A + +L E ++ + K +A K A+ A + Q A++++L
Sbjct: 256 AALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLR 315

Query: 222 AGEITKDELDAQLQR 236
E QL+
Sbjct: 316 RDLDASREAKKQLEA 330


18  smi_0984  smi_0979  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
smi_0984       4       21  -0.718091  DNA primase
smi_0983       5       21  -0.355852  ABC transporter ATP-binding protein
smi_0982       4       20   0.309734  ABC transporter, membrane-spanning permease
smi_0981       2       15   0.699328  ABC transporter substrate-binding protein
smi_0980       3       18   0.645930  T-box leader
smi_0979       3       18   0.838604  transposase, ISSmi1
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0984  VACCYTOTOXIN  31     0.009    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.2 bits (70), Expect = 0.009
Identities = 24/100 (24%), Positives = 46/100 (46%), Gaps = 15/100 (15%)

Query: 257 ANSYFAMVNGG----WFGLGLGNSIEKRGYLPEAHTDFVFSIVIEEFGFVGASLILALVF 312
+N A+ NG F +E R Y + ++ + V++EF G+S ++L
Sbjct: 1177 SNQKVALKNGASSQHLFNASAN--VEARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNT 1234

Query: 313 FLILRIILVGIRAKNPFNSMMAIGVGGMMLV--QVFVNIG 350
F + +NP N+ + +GG + + +VF+N+G
Sbjct: 1235 FKVNA-------TRNPLNTHARVMMGGELKLAKEVFLNLG 1267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0980  DPTHRIATOXIN  30     0.019    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 29.7 bits (66), Expect = 0.019
Identities = 18/49 (36%), Positives = 26/49 (53%), Gaps = 4/49 (8%)

Query: 59 NESPEHLTNKEVLYQWLKKETEVQLEHP-LPELKQIAD---VFVNGNLA 103
+ESP ++E Q+L++ + LEHP L ELK + VF N A
Sbjct: 263 SESPNKTVSEEKAKQYLEEFHQTALEHPELSELKTVTGTNPVFAGANYA 311


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0979  FLAGELLIN  34     0.003    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 34.2 bits (78), Expect = 0.003
Identities = 57/378 (15%), Positives = 97/378 (25%), Gaps = 20/378 (5%)

Query: 693 DAAKQAAQDAATKANQAIDAATDNA--GVATAQT-----DGIAAIEAVTPTVAVKAAA-- 743
DAA QA + T + + A+ NA G++ AQT + I ++V+A
Sbjct: 43 DAAGQAIANRFTSNIKGLTQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGT 102

Query: 744 ---------KAEVAKKLAEKLTALEGTPNATKEEKDAAKQAAQDAATKANQAIDAATDNA 794
+ E+ ++L E T + Q + I
Sbjct: 103 NSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKI 162

Query: 795 GVATAQTDGIAAIEAVTPTV-AVKAAAKAEVAKKLAEKLTALEGTPNATKEEKDAAKQAA 853
V + DG TV +K++ K +
Sbjct: 163 DVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPT 222

Query: 854 QDAATKANQAIDAATDNAGVATAQTDGIAAIEAVTPTVAVKAAAKAEVAKKLAEKLTALE 913
N A T + D ++ T KA A A K +
Sbjct: 223 VPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKG 282

Query: 914 GTPNATKEEKDAAKQAAQDAATKANQAIDAATDNAGVATAQTDGIAAIEAVTPTVAVKAA 973
T + + + A AG A + + + V +V
Sbjct: 283 VTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQF 342

Query: 974 AKAEVAKKLAEKLTALEGTPNATKEEKDAAKQAAQDAATKA-NQAIDAATDNAGVATAQT 1032
+ K + KL+ LE E K A A + T +
Sbjct: 343 TFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGV 402

Query: 1033 DGIAAIEAVTPTVAVKAA 1050
+ +A +
Sbjct: 403 STLINEDAAAAKKSTANP 420


19  smi_0849  smi_0835  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
smi_0849       3       24  -2.713720  4-methyl-5(b-hydroxyethyl)-thiazole
smi_0848       0       23  -3.621852  rod shape determining protein RodA
smi_0847       0       12  -0.066503  ATP-dependent helicase DinG
smi_0846       0       11   1.128540  protease
smi_0845      -1       10   1.163135  hypothetical protein
smi_0844      -1       11   1.123099  histidine kinase
smi_0843      -1       13   1.380259  response regulator
smi_0842      -1       14   1.458731  aminopeptidase N
smi_0841      -1       13  -0.189462  hypothetical protein
smi_0840      -2       16  -3.510256  hypothetical protein
smi_0839      -1       18  -3.309274  hypothetical protein
smi_0838      -2       20  -4.741512  hypothetical protein
smi_0837      -1       23  -7.065851  short chain dehydrogenase
smi_0836      -2       20  -6.282632  hypothetical protein
smi_0835      -2       15  -3.959357  oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0849  UREASE  25     0.043    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 25.5 bits (56), Expect = 0.043
Identities = 10/26 (38%), Positives = 13/26 (50%)

Query: 13 INPSIGDEIDAWAFGVEPDLLADLVL 38
INP+I + +E ADLVL
Sbjct: 411 INPAIAHGLSHEIGSLEVGKRADLVL 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0845  DHBDHDRGNASE  92     2e-24    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 91.7 bits (227), Expect = 2e-24
Identities = 63/252 (25%), Positives = 105/252 (41%), Gaps = 24/252 (9%)

Query: 3 RRVLITGVSSGIGLAQARLFLEKSYQVYGVDQGENPLL-----EGDFHFLQRDLTLDL-- 55
+ ITG + GIG A AR + + VD L D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 56 -EPIFDWCPR-------VDVLCNTAGVLDDYKPLLEQTAQEIQEIFEINYMTPVELTRYY 107
I + R +D+L N AGVL + + +E + F +N +R
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLR-PGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 108 LTQMLENKKGTIINMCSIASSLAGGGGHAYTSSKHALAGFTKQLAIDYAEAGIQIFGIAP 167
M++ + G+I+ + S + + AY SSK A FTK L ++ AE I+ ++P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 168 GAVKTAMT--------AADFEPGGLADWVASETPIKRWIEPEEVAEVSLFLASGKVSAMQ 219
G+ +T M A+ G + + P+K+ +P ++A+ LFL SG+ +
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 220 GQILTIDGGWSL 231
L +DGG +L
Sbjct: 248 MHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0839  RTXTOXIND  44     5e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 5e-07
Identities = 27/154 (17%), Positives = 53/154 (34%), Gaps = 8/154 (5%)

Query: 47 LVVAKEGSVASSVLLSGTVTAKNEQYVYFDASKGDLDEILVSVGDKVSEGQALVKYSSSE 106
+++ G V +G +T + EI+V G+ V +G L+K ++
Sbjct: 72 FILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALG 131

Query: 107 AQAAYDSASRAVAKADRHINELNQARNEAASAPAP--QLPASAV-GEGVAAQAPAPVSGN 163
A+A ++ +A L Q R + S +LP + E
Sbjct: 132 AEADTLKTQSSLLQA-----RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 164 SVSSIDAQLGDARDARADAAAQLSKAQSQLDAMT 197
S I Q ++ + L K +++ +
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVL 220


20  smi_0815  smi_0771  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
smi_0815       3       21  -0.649626  hypothetical protein
smi_0814       5       27  -1.395518  peptidyl-prolyl cis-trans isomerase
smi_0813       6       34  -2.562873  ABC transporter ATP-binding protein
smi_0812       5       33  -3.279120  VanZ-related protein
smi_0811       3       33  -3.465171  23S rRNA (adenine(2503)-C(2))-methyltransferase
smi_0810       1       31  -2.884379  hypothetical protein
smi_0809       1       30  -4.289602  superoxide dismutase
smi_0808       1       34  -5.602535  DNA polymerase III, delta subunit
smi_0807       1       34  -6.262555  dihydroorotate dehydrogenase
smi_0806       1       34  -5.405487  glutamate dehydrogenase
smi_0805       1       35  -6.887442  transposase, ISSsu4
smi_0804       2       37  -9.336770  transposase, ISSsu4, authentic frameshift
smi_0803       1       36  -8.978302  transposase, ISSsu4, authentic frameshift
smi_0802       0       31  -7.107284  hypothetical protein
smi_0801       0       32  -6.871439  transposase, ISSsu4, authentic frameshift
smi_0800       0       29  -6.742910  transposase, ISSsu4, authentic frameshift
smi_0799       2       31  -5.592841  ATP synthase, subunit D
smi_0798       3       30  -5.306820  V-type H+-ATPase, subunit B
smi_0797       3       30  -5.077358  V-type H+-ATPase, subunit A
smi_0796       2       28  -5.889335  V-type H+-ATPase, subunit F
smi_0795       2       27  -5.358423  V-type H+-ATPase, subunit C
smi_0794       1       29  -6.212384  V-type H+-ATPase, subunit E
smi_0793       0       29  -6.696806  V-type H+-ATPase, subunit K
smi_0792       1       28  -5.147210  V-type H+-ATPase, subunit I
smi_0791       4       30  -4.970973  hypothetical protein
smi_0790       4       28  -4.759206  transcriptional regulator
smi_0789       2       28  -4.292278  oxidoreductase, GFO/IDH/MOCA family
smi_0788       1       28  -2.716651  hypothetical protein
smi_0787       2       20  -0.044194  sodium:solute symporter family protein
smi_0786       2       28   2.100305  N-acetylneuraminate lyase
smi_0785       3       34   4.215659  N-acetylmannosamine-6-phosphate epimerase
smi_0784       3       37   4.710840  transcriptional regulator
smi_0783       3       35   4.407620  metal-dependent CAAX amino terminal membrane
smi_0782       1       34   4.045445  hypothetical protein
smi_0781       1       29   3.489720  hypothetical protein
smi_0780       0       24   2.993975  hypothetical protein
smi_0779      -1       20   2.966106  methyl transferase
smi_0778      -1       15   2.480649  hypothetical protein
smi_0777       0       15   2.223458  hypothetical protein
smi_0776      -1       13   0.523113  50S ribosomal protein L7/L12
smi_0775      -2       14  -1.176463  50S ribosomal protein L10
smi_0774      -1       12  -2.115848  transposase, ISSmi1
smi_0773      -1       14  -3.119019  transposase, ISSmi1
smi_0772       1       17  -3.734184  transposase, ISSmi1
smi_0771       0       16  -3.010651  transposase, ISSmi1
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_0810  IGASERPTASE  34     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.3 bits (78), Expect = 0.003
Identities = 42/285 (14%), Positives = 81/285 (28%), Gaps = 16/285 (5%)

Query: 679 PTTATSKDAAKQEITNAAEAKKSAIDGQAGLTSEEKAAAKAAVDTEAEKAKAAIDKATDQ 738
D N A+ + + + +++K + +Q
Sbjct: 999 TPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTV--EKNEQ 1056

Query: 739 AGVDSAKAAGKQAIENVPTNGTNATTAKEAAKKAIIDAANKKKAEIDANPALTKEEKEAA 798
++ + A E N T + A + TKE
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNE---------VAQSGSETKETQTTETKETATVE 1107

Query: 799 KKAVDAEAEKAKQEIDKATDQAGIDKAKNDGLTAIENVPTNGTNATTAKEAAKKAITDAA 858
K+ + QE+ K T Q + +++ + KE + T A
Sbjct: 1108 KEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 859 NKKKA-ATDKATDQAGVDSAKVAGKKVIENVPTTATSKDAAKQEITNAAEAKKSAIDGQA 917
++ A T +Q +S V + P T A Q N+ + K +
Sbjct: 1168 TEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTP--ATTQPTVNSESSNKPKNRHRR 1225

Query: 918 GLTAEEKAAAKAAVDETNSRKVATELPNTGTTDSTVAVLAAVASA 962
+ + A + VA L + +T++ + A A A
Sbjct: 1226 SVRSVPHNVEPATTSSNDRSTVA--LCDLTSTNTNAVLSDARAKA 1268


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0778  UREASE  37     9e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.0 bits (86), Expect = 9e-05
Identities = 23/65 (35%), Positives = 32/65 (49%), Gaps = 9/65 (13%)

Query: 219 RTAALLQKMK---------SGDASQFPIETALKALTIEGAKVLGMDEQIGSLEVGKQADF 269
RT KMK +GD F ++ + TI A G+ +IGSLEVGK+AD
Sbjct: 375 RTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSHEIGSLEVGKRADL 434

Query: 270 LVIQP 274
++ P
Sbjct: 435 VLWNP 439


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_0772  BACINVASINB  28     0.033    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 28.2 bits (62), Expect = 0.033
Identities = 15/63 (23%), Positives = 35/63 (55%), Gaps = 4/63 (6%)

Query: 87 EDLSDLPDMEELAQMSPDEFIKTLEKSIADKTKDDIEAIQSLEQVEAKEEEQEQAEQEAE 146
++LS++ + L M FI+ + K+ + ++D+ +L++ E E++ AE + E
Sbjct: 248 DNLSNVARLTMLMAM----FIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEFQEE 303

Query: 147 SKK 149
++K
Sbjct: 304 TRK 306


21  smi_0760  smi_0733  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
smi_0760       0       18   3.879508  glycosyl transferase
smi_0759      -1       18   3.478067  glycosyl transferase
smi_0758      -1       16   2.630077  phosphorylcholine transferase LicD3
smi_0757       2       18   1.240516  transcriptional regulator
smi_0756       2       18   0.737224  prephenate dehydratase
smi_0755       0       18   1.145738  shikimate kinase
smi_0754      -2       16   1.581913  3-phosphoshikimate 1-carboxyvinyltransferase
smi_0753      -2       18   1.756355  hypothetical protein
smi_0752      -2       17   2.487738  hypothetical protein
smi_0751      -2       19   3.215075  prephenate dehydrogenase
smi_0750      -2       22   3.313421  chorismate synthase,
smi_0749      -1       23   3.133977  3-dehydroquinate synthase
smi_0748       0       30   3.083543  shikimate 5-dehydrogenase
smi_0747      -2       27   3.336312  3-dehydroquinase
smi_0746      -2       23   2.450330  hypothetical protein
smi_0745      -1       23   2.966359  phosphoglycerol transferase and related
smi_0744      -1       22   3.184171  hypothetical protein
smi_0743      -2       16   2.723737  ABC transporter ATP-binding protein
smi_0742      -1       15   2.103088  alpha-amylase
smi_0741       0       16   1.922665  alanyl-tRNA synthetase
smi_0740       0       18   2.225096  hypothetical protein
smi_0739       0       18   2.248942  transposase, ISSmi1
smi_0738      -2       18   2.191880  ABC transporter, spermidine/putrescine-binding
smi_0737      -1       22   3.007476  ABC-transporter, spermidine/putrescine transport
smi_0736       0       19   3.669271  ABC-transporter, spermidine/putrescine transport
smi_0735       2       18   3.803411  ABC transporter, ATP-binding
smi_0734       3       17   3.943809  UDP-N-acetylenolpyruvoylglucosamine reductase
smi_0733       2       19   3.224332  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0757  UREASE  29     0.035    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.9 bits (65), Expect = 0.035
Identities = 20/83 (24%), Positives = 31/83 (37%), Gaps = 16/83 (19%)

Query: 122 IHFV---QIPTSLTAQVDSSIGGKTGVN--------TPFAKNMVGTFAQPDGVLIDPLVL 170
IHF+ QI +L + + +GG TG TP ++ D P+ L
Sbjct: 137 IHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIARMIEAADAF---PMNL 193

Query: 171 GTLGK--RELIEGMGEVIKYGLI 191
GK L + E++ G
Sbjct: 194 AFAGKGNASLPGALVEMVLGGAT 216


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
smi_0746  MYCMG045  51     4e-09    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 50.9 bits (121), Expect = 4e-09
Identities = 78/331 (23%), Positives = 136/331 (41%), Gaps = 65/331 (19%)

Query: 25 LDSKINSRDSQKLVIYNWGDYIDPELLEQFTEETGIQVQYETFDSNEAMYTKIKQGGTTY 84
L S ++S S V+ N+ YI P LLE+ E+ + + T+ SNE + TY
Sbjct: 16 LSSILSSCGSTTFVLANFESYISPLLLERVQEKH--PLTFLTYPSNEKLINGF--ANNTY 71

Query: 85 DIAIPSEYMINKMKDEDLLVPLDYSK-----------------------IEGLENIGPEF 121
+A+ S Y ++++ + DLL P+D+S+ I+ ++ I +
Sbjct: 72 SVAVASTYAVSELIERDLLSPIDWSQFNLKKSSSSSDKVNNASDAKDLFIDSIKEISQQT 131

Query: 122 LNQSFDPGNKFSIPYFWGTLGIVYNETMVEEAPEH---WDDLWKPEYK-------NSIML 171
+ + +++PYF L VY + E + W D+ K K N ++
Sbjct: 132 KDSKNNELLHWAVPYFLQNLVFVYRGEKISELEQENVSWTDVIKAIVKHKDRFNDNRLVF 191

Query: 172 FDGAREVLGLG---------------LNSLGYSLNSKDS-QQLEETVDKLYKLTPNIKA- 214
D AR + L + +GY N +S Q+L T L + N +
Sbjct: 192 IDDARTIFSLANIVNTNNNSADVNPKEDGIGYFTNVYESFQRLGLTKSNLDSIFVNSDSN 251

Query: 215 IVADEM-----KGYMIQNNAAIGVTFSGEASQMLEKNE----NLRYVVPTEASNLWFDNM 265
IV +E+ +G ++ N A+ G+ L + + N ++V + S + D +
Sbjct: 252 IVINELASGRRQGGIVYNGDAVYAALGGDLRDELSEEQIPDGNNFHIVQPKISPVALDLL 311

Query: 266 VIPKTVKN-QDAAYAFINFMLKPENALKNAE 295
VI K N Q A+ I F L + A + E
Sbjct: 312 VINKQQSNFQKEAHEII-FDLALDGADQTKE 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0738  ADHESNFAMILY  31     0.005    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 31.0 bits (70), Expect = 0.005
Identities = 17/45 (37%), Positives = 21/45 (46%), Gaps = 5/45 (11%)

Query: 4 KKWIFVLCSFLATFFLVACQSGSNGSQSAVEAIKQKGKLVVATSP 48
KK +L FL+ LVAC SG + S QK K+V S
Sbjct: 2 KKLGTLLVLFLSAIILVACASGKKDTTS-----GQKLKVVATNSI 41


22  smi_0710  smi_0694  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
smi_0710       1       21  -5.051344  hypothetical protein
smi_0709       2       19  -4.698536  prolipoprotein diacylglycerol transferase
smi_0708       2       18  -4.663615  Hpr(Ser) kinase/phosphatase
smi_0707       3       17  -4.561706  30S ribosomal protein S21
smi_0706       3       19  -4.834799  N-acetylglucosamine-6-phosphate isomerase
smi_0705       3       18  -4.651990  S-adenosylmethionine:tRNA
smi_0703       1       22  -1.327614  ribonuclease BN
smi_0702       0       23  -1.936325  Na+/H+ antiporter
smi_0701       2       24  -5.665095  hypothetical protein
smi_0700       0       26  -6.164653  hypothetical protein
smi_0699      -1       15  -3.262285  hypothetical protein
smi_0698      -1       14  -2.631693  hypothetical protein
smi_0697      -2       11  -1.023950  hypothetical protein
smi_0696      -1       10  -0.742847  cell wall-associated serine proteinase
smi_0695       1       14   1.466468  *ABC transporter permease
smi_0694       2       15   1.162095  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_0705  SUBTILISIN  93     5e-22    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 93.0 bits (231), Expect = 5e-22
Identities = 49/233 (21%), Positives = 85/233 (36%), Gaps = 57/233 (24%)

Query: 214 LKAINAQF-GKNFDGRGMVISNIDTGTDYRHKAMRIDDDAKGSMRFKKEDLKGTDKNFWL 272
++ I A GRG+ ++ +DTG D H DLK
Sbjct: 26 VEMIQAPAVWNQTRGRGVKVAVLDTGCDADH-----------------PDLKA------- 61

Query: 273 SDKIPHAFNYYNGGKITVEKADDGSDYFDPHGMHIAGILAGNDTEKDIKNFNGIDGIAPN 332
+I N+ + + E D + HG H+AG +A + N NG+ G+AP
Sbjct: 62 --RIIGGRNFTDDDEGDPEIFKDY----NGHGTHVAGTIAATE------NENGVVGVAPE 109

Query: 333 AQIFSYKMYSDAGSGFAGDETMFHAIEDSIKHNVDVVSVSSGFTGTGLVGEKYWQAIRAL 392
A + K+ + GSG + I +I+ VD++S+S G +A++
Sbjct: 110 ADLLIIKVLNKQGSGQYDW--IIQGIYYAIEQKVDIISMSLGGPEDVPELH---EAVKKA 164

Query: 393 RKAGIPMVVATGNYATSASSSSWDLVANNHLKMTDTGNVTRTAAHEDAIAVAS 445
+ I ++ A GN T + + + I+V +
Sbjct: 165 VASQILVMCAAGNEGDGDDR---------------TDELGYPGCYNEVISVGA 202



Score = 59.9 bits (145), Expect = 3e-11
Identities = 38/139 (27%), Positives = 56/139 (40%), Gaps = 32/139 (23%)

Query: 665 PDVSAPGKNIKSTLNVINGKSTYGYMSGTSMATPIVAASTVLIRPKLKEMLERPVLKNLE 724
D+ APG++I ST+ Y SGTSMATP VA + LI+ ER
Sbjct: 219 VDLVAPGEDILSTVP----GGKYATFSGTSMATPHVAGALALIKQLANASFER------- 267

Query: 725 GDDKIDLTSLT-KIALQNTARPMMDATSWKEKSQYFASPRQQGAGLINVANALRNEVVAT 783
DLT L P+ + SP+ +G GL+ + E+
Sbjct: 268 -----DLTEPELYAQLIKRTIPLGN------------SPKMEGNGLLYLTAV--EELSRI 308

Query: 784 FKNKDSKGLVNSYGSISLK 802
F + G++ S S+ +K
Sbjct: 309 FDTQRVAGIL-STASLKVK 326


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0698  TOXICSSTOXIN  25     0.035    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 25.4 bits (55), Expect = 0.035
Identities = 6/25 (24%), Positives = 13/25 (52%)

Query: 45 LLNLDIKVRRLLVKNYSVFYRFDKD 69
+ LD ++R L + + ++ DK
Sbjct: 166 ISTLDFEIRHQLTQIHGLYRSSDKT 190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0694  PF06580  31     0.009    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.0 bits (70), Expect = 0.009
Identities = 22/102 (21%), Positives = 43/102 (42%), Gaps = 15/102 (14%)

Query: 288 ILSLSSV--QELRDDREEIDLLQMTQSLVKDYTLLAKKRELQIDNSLTYQ----QAYLNP 341
+ SLS + LR L ++V Y LA + ++ L ++ A ++
Sbjct: 197 LTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQ---FEDRLQFENQINPAIMDV 253

Query: 342 SVMKLILSNLISNAIKHSVL----GGLVRIG--EREGELFIE 377
V +++ L+ N IKH + GG + + + G + +E
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295


23  smi_0669  smi_0638  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias    %GCBias  Product
smi_0669       0       14   3.519478  phosphomethylpyrimidine kinase
smi_0668       0       18   4.294587  tRNA pseudouridine synthase A
smi_0667       0       25   4.751553  major facilitator superfamily
smi_0666       0       24   4.421457  hypothetical protein
smi_0665       0       23   4.035315  PhnA protein, required for expression of the
smi_0664      -1       20   3.064741  cytidylate kinase
smi_0663       0       21   3.062048  hypothetical protein
smi_0662       0       19   2.476938  ferredoxin
smi_0661      -2       14   1.655356  glycosyl transferase
smi_0660       0       16   1.423613  UDP-glucose 4-epimerase
smi_0659      -1       16   0.843866  hypothetical protein
smi_0658       0       24   2.981652  hypothetical protein
smi_0657       0       21   1.265585  hypothetical protein
smi_0656       2       20  -1.348647  Bcl-2 family protein
smi_0655       0       14  -1.276810  P-type ATPase-metal cation transport
smi_0654       0       11  -0.823792  transposase, ISSmi1
smi_0653      -1       13  -0.220408  1-acylglycerol-3-phosphate O-acyltransferase
smi_0652      -1       17  -1.381947  esterase of alpha/beta hydrolase superfamily
smi_0651       0       19   0.068352  iron(III) dicitrate-binding lipoprotein
smi_0650       2       25   2.565777  ABC-transporter, Fe3+-siderophore, ATP-binding
smi_0649       3       26   2.859146  ABC-transporter Fe3+-siderophore, permease
smi_0648       4       25   2.860733  30S ribosomal protein S15
smi_0647       1       23   3.754021  hypothetical protein
smi_0646       0       22   3.433048  hypothetical protein
smi_0645       0       24   3.481573  Threonyl-tRNA synthetase, threonine-tRNA ligase
smi_0644      -2       18   3.516139  hypothetical protein
smi_0643      -2       13   1.820450  ABC transporter ATP-binding protein
smi_0642       0       11  -0.051182  histidine kinase
smi_0641       0       18  -1.735400  response regulator
smi_0640       1       17  -1.284393  hypothetical protein
smi_0639       0       18  -2.116779  hypothetical protein
smi_0638       0       23  -3.690473  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0668  NUCEPIMERASE  183    2e-57    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 183 bits (465), Expect = 2e-57
Identities = 87/350 (24%), Positives = 149/350 (42%), Gaps = 48/350 (13%)

Query: 4 KILVTGGAGFIGTHTVIELIQAGHQVVVVDNLVNSNRKSLEV--VERITGVEIPFYEADI 61
K LVTG AGFIG H L++AGHQVV +DNL + SL+ +E + F++ D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 62 RDTDTLRDIFKQEEPTGVIHFAGLKAVGESTRIPLAYYDNNIAGTVSLLKAMEEANCKNI 121
D + + D+F V AV S P AY D+N+ G +++L+ +++
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 122 IFSSSATVYGDPHTVPILE----DFPLSVTNPYGRTKLMLEEI---LTDIYKADSEWNVV 174
+++SS++VYG +P D P+S Y TK E + + +Y
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVS---LYAATKKANELMAHTYSHLYGLP----AT 174

Query: 175 LLRYFNPIGAHESGDLGENPNGIPNNLLPYVTQVAVGKLEQVQVFGDDYDTEDGTGVRDY 234
LR+F G P G P+ L T+ A+ + + + V+ G RD+
Sbjct: 175 GLRFFTVYG----------PWGRPDMALFKFTK-AMLEGKSIDVYN------YGKMKRDF 217

Query: 235 IHVVDLAKGHVAALKKIQKGSG---------------LNVYNLGTGKGYSVLEIIQNMEK 279
++ D+A+ + I VYN+G +++ IQ +E
Sbjct: 218 TYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALED 277

Query: 280 AVGRPIPYRIVDRRPGDIAACYSDPAKAKGELGWEAELGITQMCEDAWRW 329
A+G ++ +PGD+ +D +G+ E + ++ W
Sbjct: 278 ALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0659  ADHESNFAMILY  31     0.008    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 30.6 bits (69), Expect = 0.008
Identities = 11/68 (16%), Positives = 25/68 (36%), Gaps = 7/68 (10%)

Query: 1 MKKILSILLVTVATLTMAACGNTTTEKATTQSSTETSQKASTETTYPLTVKTYDAKGNEV 60
MKK+ ++L++ ++ + + AC + T + QK T + +
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGK-------KDTTSGQKLKVVATNSIIADITKNIAGDK 53

Query: 61 EQVFDKAP 68
+ P
Sbjct: 54 IDLHSIVP 61


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0649  HTHFIS  79     4e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 78.7 bits (194), Expect = 4e-19
Identities = 31/142 (21%), Positives = 66/142 (46%), Gaps = 3/142 (2%)

Query: 3 KILLVEDDQVIRQQVGKMLSEWGFEVVLVEDFMEVLSLFVQSEPHLVLMDIGLPLFNGYH 62
IL+ +DD IR + + LS G++V + + + + LV+ D+ +P N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 WCQEIRKI-SKVPIMFLSSRDQAMDIVMAINMGADDFVTKPFDQQVLLAKVQGLL--RRS 119
I+K +P++ +S+++ M + A GA D++ KPFD L+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 120 YEFGRDESLLEYAGVILNTKSM 141
++ + ++ + +M
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAM 146


24smi_0599smi_0586Y        Y        YPathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
smi_0599-1183.101904hypothetical protein
smi_0598-1203.586812hypothetical protein
smi_05970193.804014ABC transporter-sugar transport,
smi_0596-1194.215648ABC transporter permease-sugar transport,
smi_0594-2163.844625ABC transporter, sugar transporter, sugar
smi_0593-2183.008093N-acetylmannosamine-6-phosphate 2-epimerase 2
smi_0592-1191.848860sialidase A (neuraminidase A)
smi_05910202.122443acetyl xylan esterase
smi_05901192.274947hypothetical protein
smi_05891192.426495hypothetical protein
smi_0588-1172.177011transposase
smi_0587-2162.450016transposase
smi_0586-2173.059058branch migration of Holliday junctions,
ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
smi_0593ALARACEMASE354e-123 Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 354 bits (910), Expect = e-123
Identities = 130/365 (35%), Positives = 185/365 (50%), Gaps = 17/365 (4%)

Query: 7 RPTKALIHLGAIRQNIQQMGAHIPQGTLKWAVVKANAYGHGAVTVAKAIQDDVDGFCVSN 66
RP +A + L A++QN+ + + W+VVKANAYGHG + AI DGF + N
Sbjct: 3 RPIQASLDLQALKQNLSIVRQAATHARV-WSVVKANAYGHGIERIWSAI-GATDGFALLN 60

Query: 67 IDEAIELRQAGLSKKILIL-GVSEIEAVSLAKEYDITLTVAGLEWIQALLDKEADLTGLT 125
++EAI LR+ G IL+L G + + + ++ +T V ++AL + L
Sbjct: 61 LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAP-LD 119

Query: 126 VHLKIDSGMGRIGFREAGEAEQAQDLLQQHGAYVEGIFTHFATADEESDTYFNTQLERFK 185
++LK++SGM R+GF+ Q L + +HFA A+ + + R +
Sbjct: 120 IYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPD--GISGAMARIE 177

Query: 186 TILESMKGLPELVHASNSATTLWHAETIFNAVRMGDAMYGLNPSGEVLDL-PYGLTPALT 244
E GL SNSA TLWH E F+ VR G +YG +PSG+ D+ GL P +T
Sbjct: 178 QAAE---GLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMT 234

Query: 245 LESALVHVKTVPVGACMGYGATYQADSEQVIATVPIGYADGWTRDMQN-FSVLVDGQACP 303
L S ++ V+T+ G +GYG Y A EQ I V GYADG+ R VLVDG
Sbjct: 235 LSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTM 294

Query: 304 IVGRVSMDQITIRLPKL--YPLGTKVTLIGSNGDKEITATQVATYRGTINYEVVCLLSDR 361
VG VSMD + + L +GT V L G KEI VA GT+ YE++C L+ R
Sbjct: 295 TVGTVSMDMLAVDLTPCPQAGIGTPVELWG----KEIKIDDVAAAAGTVGYELMCALALR 350

Query: 362 IPREY 366
+P
Sbjct: 351 VPVVT 355


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
smi_0589  SECA  1055   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1055 bits (2729), Expect = 0.0
Identities = 391/905 (43%), Positives = 560/905 (61%), Gaps = 73/905 (8%)

Query: 1 MANILKTIIENDKG-EIRRLEKMADKVFKYEDQMAALTDDQLKAKTVEFKERYQNGESLD 59
+ +L + + +RR+ K+ + + E +M L+D++LK KT EF+ R + GE L+
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 60 SLLYEAFAVVREGAKRVLGLFPYKVQVMGGIVLHHGDVPEMRTGEGKTLTATMPVYLNAL 119
+L+ EAFAVVRE +KRV G+ + VQ++GG+VL+ + EMRTGEGKTLTAT+P YLNAL
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 120 SGKGVHVVTVNEYLSERDATEMGELYSWLGLSVGINLAAKSPMEKKEAYECDITYSTNSE 179
+GKGVHVVTVN+YL++RDA L+ +LGL+VGINL K+EAY DITY TN+E
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 180 IGFDYLRDNMVVRAENMVQRPLNYALVDEVDSILIDEARTPLIVSGANAVETSQLYHMAD 239
GFDYLRDNM E VQR L+YALVDEVDSILIDEARTPLI+SG + ++Y +
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSS-EMYKRVN 240

Query: 240 HYVKSLDKD------------DYIIDVQSKTIGLSDSGIDKAESYF-------KLENLYD 280
+ L + + +D +S+ + L++ G+ E + E+LY
Sbjct: 241 KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS 300

Query: 281 IENVALTHFIDNALRANYIMLLDIDYVVSEEQEILIVDQFTGRTMEGRRYSDGLHQAIEA 340
N+ L H + ALRA+ + D+DY+V ++ E++IVD+ TGRTM+GRR+SDGLHQA+EA
Sbjct: 301 PANIMLMHHVTAALRAHALFTRDVDYIV-KDGEVIIVDEHTGRTMQGRRWSDGLHQAVEA 359

Query: 341 KEGVPIQDETKTSASITYQNLFRMYKKLSGMTGTGKTEEEEFREIYNIRVIPIPTNRPVQ 400
KEGV IQ+E +T ASIT+QN FR+Y+KL+GMTGT TE EF IY + + +PTNRP+
Sbjct: 360 KEGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMI 419

Query: 401 RIDHSDLLYASIEAKFKAVVEDVKARYQKGQPVLVGTVAVETSDYISKKLVAAGVPHEVL 460
R D DL+Y + K +A++ED+K R KGQPVLVGT+++E S+ +S +L AG+ H VL
Sbjct: 420 RKDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVL 479

Query: 461 NAKNHYKEAQIIMNAGQRGAVTIATNMAGRGTDIKLG----------------------- 497
NAK H EA I+ AG AVTIATNMAGRGTDI LG
Sbjct: 480 NAKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKA 539

Query: 498 ------EGVRELGGLCVIGTERHESRRIDNQLRGRSGRQGDPGESQFYLSLEDDLMKRFG 551
+ V E GGL +IGTERHESRRIDNQLRGRSGRQGD G S+FYLS+ED LM+ F
Sbjct: 540 DWQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFA 599

Query: 552 SERLKGIFERLNMSE-EAIESRMLTRQVEAAQKRVEGNNYDTRKQVLQYDDVMREQREII 610
S+R+ G+ +L M EAIE +T+ + AQ++VE N+D RKQ+L+YDDV +QR I
Sbjct: 600 SDRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAI 659

Query: 611 YAQRYDVITADRDLAPEIQAMIKRTIERVVDGHARAKQDEK---LEAILNFAKFNLLPED 667
Y+QR +++ D++ I ++ + + +D + + E+ + + K N D
Sbjct: 660 YSQRNELLDVS-DVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLK-NDFDLD 717

Query: 668 SITMDDLSGLPD---KTIKEELFQRALQVYDSQVSKLRDEDAVKEFQKVLILRVVDNKWT 724
+ L P+ +T++E + ++++VY + + + ++ F+K ++L+ +D+ W
Sbjct: 718 LPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVG-AEMMRHFEKGVMLQTLDSLWK 776

Query: 725 DHIDALDQLRNAVGLRGYAQNNPVVEYQAEGFRMFNDMIGSIEFDVTRLMMKAQIH---- 780
+H+ A+D LR + LRGYAQ +P EY+ E F MF M+ S++++V + K Q+
Sbjct: 777 EHLAAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEE 836

Query: 781 -----EQERPQAEHHISTTATRNIAAHQA---NIPEDLDLSQIGRNELCPCGSGKKFKNC 832
+Q R +AE + A + ++GRN+ CPCGSGKK+K C
Sbjct: 837 VEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQC 896

Query: 833 HGKRQ 837
HG+ Q
Sbjct: 897 HGRLQ 901


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0586  TCRTETOQM  38     9e-05    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 37.5 bits (87), Expect = 9e-05
Identities = 30/139 (21%), Positives = 53/139 (38%), Gaps = 25/139 (17%)

Query: 1 MALPTIAIVGRPNVGKSTLFNRI-----AGERISIV------------EDVEGVTRDRIY 43
M + I ++ + GK+TL + A + V E G+T
Sbjct: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 44 ATGEWLNRSFSMIDTGGIDDVDAPFMEQIKHQAEIAMEEADVIVFVVSGKEGITDADEYV 103
+ +W N ++IDT G D A + ++ D + ++S K+G+ +
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLA--------EVYRSLSVLDGAILLISAKDGVQAQTRIL 112

Query: 104 ARKLYKTHKPVILAVNKVD 122
L K P I +NK+D
Sbjct: 113 FHALRKMGIPTIFFINKID 131


25  smi_0575  smi_0539  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0575    -1      28       4.167568   chromosome replication initiation
smi_0574    0       31       4.039224   transcriptional regulator
smi_0573    1       26       4.426997   GntR family transcriptional regulator
smi_0572    1       25       4.364787   ABC transporter, ATP binding domain, ABC-type
smi_0571    1       18       2.424394   ABC transporter permease
smi_0570    -2      14       3.031388   ABC exporter ATP binding/membrane-spanning
smi_0569    0       14       2.654457   LrgB family protein
smi_0568    -1      14       3.559959   effector of murein hydrolase, LrgA family
smi_0567    -2      13       3.313414   transcriptional regulator
smi_0566    0       16       3.094759   hypothetical protein
smi_0565    1       21       4.977967   thioredoxin
smi_0564    0       20       4.224746   7-cyano-7-deazaguanine synthase QueC
smi_0563    -1      18       3.651414   6-pyruvoyl-tetrahydropterin synthase
smi_0562    -1      20       2.828902   7-cyano-7-deazaguanosine (preQ0) biosynthesis
smi_0561    0       18       -0.756838  7-cyano-7-deazaguanine reductase
smi_0560    0       17       -0.722363  aquaporin Z-water channel protein
smi_0558    -1      18       -4.607675  acetoin reductase
smi_0557    -2      17       -4.706852  hypothetical protein
smi_0556    -2      22       -6.632028  oligoendopeptidase F
smi_0555    -1      25       -7.209634  16S rRNA (uracil(1498)-N(3))-methyltransferase
smi_0554    -1      32       -4.182156  ribosomal protein methyltransferase
smi_0553    3       23       -2.730230  7,8-dihydro-8-oxoguanine-triphosphatase
smi_0552    1       21       0.136633   hypothetical protein
smi_0551    0       17       2.417525   chromosome segregation helicase, ATPase, AAA
smi_0550    1       15       3.603304   *transcriptional regulator
smi_0549    0       16       4.041745   NADPH:quinone reductase and related Zn-dependent
smi_0548    0       15       4.217338   hypothetical protein
smi_0547    0       15       4.112622   transcriptional activator, Mga-like regulatory
smi_0546    -1      14       3.759992   hypothetical protein
smi_0545    -1      16       2.817723   hypothetical protein
smi_0544    -2      14       2.730638   hypothetical protein
smi_0543    -2      15       2.781774   general stress protein 24
smi_0542    -2      18       2.878904   hypothetical protein
smi_0541    -2      21       3.284686   acetyltransferase, GNAT family
smi_0540    -2      22       3.296445   type II secretory pathway, prepilin signal
smi_0539    -2      22       3.077725   tryptophan synthase alpha chain
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0567  DHBDHDRGNASE  133    5e-40    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 133 bits (335), Expect = 5e-40
Identities = 81/251 (32%), Positives = 129/251 (51%), Gaps = 6/251 (2%)

Query: 3 KVAIVTGAGQGIGFAIAKRLVQDGFKVGVLDYNPETAEKAVAELSAE--NAFAVVADVSK 60
K+A +TGA QGIG A+A+ L G + +DYNPE EK V+ L AE +A A ADV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 61 QAEVAQAFQKVVDHFGDLNVVVNNAGVAPTTPLDTITEEQFTRTFGINVGGVIWGSQAAQ 120
A + + ++ G ++++VN AGV + ++++E++ TF +N GV S++
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 121 AQFKALGHGGKIINATSQAGVVGNPNLTVYGGTKFAVRGITQTLARDLADSGITVNAYAP 180
G I+ S V ++ Y +K A T+ L +LA+ I N +P
Sbjct: 129 KYMMD-RRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 181 GIVKTPMMYDIAHEVGKNAGKDDEWG-MQTFAKDITLKRLSEPEDVAAAVSFLAGPDSNY 239
G +T M + + +N + G ++TF I LK+L++P D+A AV FL + +
Sbjct: 188 GSTETDMQWSLW--ADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 240 ITGQTIIVDGG 250
IT + VDGG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0560  PF05272  30     0.015    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.015
Identities = 12/56 (21%), Positives = 25/56 (44%), Gaps = 7/56 (12%)

Query: 42 MILYGPPGIGKTSIASAIAGTTKY--AFRTFNATVDSKKRLQ-----EIAEEAKFS 90
++L G GIGK+++ + + G + DS +++ E++E F
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFR 654


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0555  PF05043  611    0.0      Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 611 bits (1577), Expect = 0.0
Identities = 467/492 (94%), Positives = 482/492 (97%)

Query: 1 MRDLLSKKSHRQLELLELLFEHKRWFHRSELAELLNCTERAVKDDLSHVKSAFPDLIFHS 60
MRDLLSKKSHRQLELLELLFEHKRWFHRSELAELLNCTERAVKDDLSHVKSAFPDLIFHS
Sbjct: 1 MRDLLSKKSHRQLELLELLFEHKRWFHRSELAELLNCTERAVKDDLSHVKSAFPDLIFHS 60

Query: 61 STNGIRIINTDDSDIEMVYHHFFKHSTHFSILEFIFFNEGCDADSICKEFYISSSSLYRI 120
STNGIRIINTDDSDIEMVYHHFFKHSTHFSILEFIFFNEGC A+SICKEFYISSSSLYRI
Sbjct: 61 STNGIRIINTDDSDIEMVYHHFFKHSTHFSILEFIFFNEGCQAESICKEFYISSSSLYRI 120

Query: 121 ISQINKVIKKQFQFEISLTPVQIIGNERDIRYFFAQYFSEKYYFLEWPFENFSVEPLSQL 180
ISQINKVIK+QFQFE+SLTPVQIIGNERDIRYFFAQYFSEKYYFLEWPFENFS EPLSQL
Sbjct: 121 ISQINKVIKRQFQFEVSLTPVQIIGNERDIRYFFAQYFSEKYYFLEWPFENFSSEPLSQL 180

Query: 181 LELVYKETSFPMNLSTHRMLKLLLVTNLYRIKFGHFMEVDKDSFNDQSLNALMQAEGIEG 240
LELVYKETSFPMNLSTHRMLKLLLVTNLYRIKFGHFMEVDKDSFNDQSL+ LMQAEGIEG
Sbjct: 181 LELVYKETSFPMNLSTHRMLKLLLVTNLYRIKFGHFMEVDKDSFNDQSLDFLMQAEGIEG 240

Query: 241 VAQSFESEYNLSLDEEVVCQLFASYFQKMFFIDENLFLKSVKRDSYVGKSYHLLSDFIDQ 300
VAQSFESEYN+SLDEEVVCQLF SYFQKMFFIDE+LF+K VK+DSYV KSYHLLSDFIDQ
Sbjct: 241 VAQSFESEYNISLDEEVVCQLFVSYFQKMFFIDESLFMKCVKKDSYVEKSYHLLSDFIDQ 300

Query: 301 ISVKYQIEIENKDSLIWHLHNTAHLYRQELSTEFILFDQKGNTIRNFQNIFPKFVSDVKK 360
ISVKYQIEIENKD+LIWHLHNTAHLYRQEL TEFILFDQKGNTIRNFQNIFPKFVSDVKK
Sbjct: 301 ISVKYQIEIENKDNLIWHLHNTAHLYRQELFTEFILFDQKGNTIRNFQNIFPKFVSDVKK 360

Query: 361 ELSHYLETLELCSSSMMVNHLSYTFITHTKHLVLNLLQNQPKLKVLVMSNFDQYHAKSVA 420
ELSHYLETLE+CSSSMMVNHLSYTFITHTKHLV+NLLQNQPKLKVLVMSNFDQYHAK VA
Sbjct: 361 ELSHYLETLEVCSSSMMVNHLSYTFITHTKHLVINLLQNQPKLKVLVMSNFDQYHAKFVA 420

Query: 421 ETLSYYCSNNFELEVWSELELSLESLKDSPYDIIISNFIIPPIENKRLIYSNNVNTVALI 480
ETLSYYCSNNFELEVW+ELELS ESL+DSPYDIIISNFIIPPIENKRLIYSNN+NTV+LI
Sbjct: 421 ETLSYYCSNNFELEVWTELELSKESLEDSPYDIIISNFIIPPIENKRLIYSNNINTVSLI 480

Query: 481 SLLNAMMFIRLD 492
LLNAMMFIRLD
Sbjct: 481 YLLNAMMFIRLD 492


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0548  PREPILNPTASE  74     4e-18    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 74.5 bits (183), Expect = 4e-18
Identities = 62/264 (23%), Positives = 95/264 (35%), Gaps = 62/264 (23%)

Query: 6 FFLVGSILASFLGLVIDRFP-------------------------EQSIISPASHCDSCQ 40
FL ++ SFL +VI R P +++ P S C C
Sbjct: 19 VFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCCPHCN 78

Query: 41 TPLRPLDLIPILSQVLNRFRCRYCKAPYPVWYALFELSLGLIFLLYFWELL----SLSQV 96
P+ L+ IP+LS + R RCR C+AP Y L EL L+ + L +L+ +
Sbjct: 79 HPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWGTLAAL 138

Query: 97 ILITAGLTLGIYDFRHQEYP-------LLVWVVFHLLLMV--------------CSDWNL 135
+L + L D P L ++F+LL W+L
Sbjct: 139 LLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLVLWSL 198

Query: 136 VMVFFLVLGILAHFIDIRMGAGDFLFLASCALVFSATELLILIQFASATGILAFLLQKKK 195
F L+ G MG GDF LA+ L I++ +S G +
Sbjct: 199 YWAFKLLTGKEG------MGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILL 252

Query: 196 ER------LPFVPFLLLAACVIIF 213
+PF P+L +A + +
Sbjct: 253 RNHHQSKPIPFGPYLAIAGWIALL 276


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0541  PERTACTIN  31     0.010    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 31.2 bits (70), Expect = 0.010
Identities = 15/39 (38%), Positives = 23/39 (58%)

Query: 41 ILAYNPVFEIKFENGVLYQNGQVIDRDPLDFLYEVTHKS 79
+L NP E++F+NG + +GQ+ D FL VT K+
Sbjct: 79 VLLENPAAELRFQNGSVTSSGQLFDEGVRRFLGTVTVKA 117


26  smi_0529  smi_0521  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0529    0       18       -3.537611  ABC transporter, ATP-binding domain/permease,
smi_0528    0       20       -4.544978  ABC transporter, ATP-binding domain/permease,
smi_0527    0       22       -4.821200  hypothetical protein
smi_0526    0       21       -4.244369  hypothetical protein
smi_0525    -2      16       -2.245160  hypothetical protein
smi_0524    -3      12       0.586289   transcriptional regulator, ArsR family
smi_0523    -3      14       1.567384   metal-dependent membrane protease, CAAX amino
smi_0522    -3      18       3.178423   exodeoxyribonuclease III
smi_0521    -2      19       3.113154   hypothetical protein
27  smi_0477  smi_0411  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0477    3       19       0.390115   hypothetical protein
smi_0476    1       18       -0.701398  transcriptional regulators, MutR family
smi_0475    2       22       -0.250987  transcriptional regulatory protein
smi_0474    2       19       0.226769   hypothetical protein
smi_0473    3       19       0.028487   CH60_SMI 60 kDa chaperonin (protein Cpn60)
smi_0472    4       19       0.569932   CH10_SMI 10 kDa chaperonin (protein Cpn10)
smi_0471    4       18       0.255112   single-stranded DNA-binding protein
smi_0470    3       18       0.765016   transposase, ISSmi2
smi_0469    1       16       0.113723   N-acetylmuramoyl-L-alanine amidase LytA
smi_0468    0       15       -0.488728  antiholin
smi_0467    0       16       0.867006   hypothetical protein
smi_0466    -1      16       0.827893   hypothetical protein
smi_0465    -1      16       0.914588   hypothetical protein
smi_0464    0       15       1.005983   hypothetical protein
smi_0463    2       15       1.288824   hypothetical protein
smi_0462    3       15       1.172484   hypothetical protein
smi_0461    1       19       0.340761   structural phage protein
smi_0460    2       20       0.142181   tail fiber protein
smi_0459    2       20       0.548936   hypothetical protein
smi_0458    2       22       0.477730   phage Mu protein gp47
smi_0457    4       19       0.060979   hypothetical protein
smi_0456    4       20       0.866471   hypothetical protein
smi_0455    5       22       -0.508811  hypothetical protein
smi_0454    5       25       -1.084036  LysM
smi_0453    5       27       -2.636271  tail length tape measure protein
smi_0452    5       25       -2.894041  hypothetical protein
smi_0451    1       23       -4.527524  core tail protein
smi_0450    1       21       -3.300151  sheath tail protein
smi_0449    2       20       -2.285935  hypothetical protein
smi_0448    2       18       -1.047858  hypothetical protein
smi_0447    3       18       -0.767661  prophage pi2 protein
smi_0446    3       19       -0.613281  hypothetical protein
smi_0445    3       19       -0.095533  hypothetical protein
smi_0444    3       19       0.408396   hypothetical protein
smi_0443    3       22       1.454217   main capsid protein Gp34-like protein
smi_0442    1       20       -0.248617  methyl-accepting chemotaxis protein scaffolding
smi_0440    0       21       -0.756838  hypothetical protein
smi_0439    3       21       -0.186663  hypothetical protein
smi_0438    2       18       -1.903335  hypothetical protein
smi_0437    2       19       -4.743976  hypothetical protein
smi_0436    1       24       -5.658502  hypothetical protein
smi_0435    2       23       -4.897569  NAD+-asparagine ADP-ribosyltransferase
smi_0433    2       27       -4.654435  serine protein kinase
smi_0432    2       31       -5.773340  terminase large subunit
smi_0431    2       30       -4.290063  small terminase
smi_0430    4       22       -2.263248  *hypothetical protein
smi_0429    4       20       -1.718807  transcriptional regulator, AbrB family
smi_0428    3       19       -1.839748  transcriptional regulator
smi_0427    3       20       -1.448662  hypothetical protein
smi_0426    4       21       -1.148773  hypothetical protein
smi_0425    4       20       -1.264879  DNA-binding protein
smi_0424    2       21       -1.103714  hypothetical protein
smi_0423    4       23       -1.106528  hypothetical protein
smi_0422    4       21       -0.284862  hypothetical protein
smi_0421    2       21       -1.135941  prophage Pi3 protein
smi_0420    1       20       -2.193511  hypothetical protein
smi_0419    0       26       -2.796054  hypothetical protein
smi_0418    0       26       -5.154001  hypothetical protein
smi_0417    0       25       -4.793535  hypothetical protein
smi_0416    0       26       -5.504119  hypothetical protein
smi_0415    0       29       -5.038068  hypothetical protein
smi_0414    0       26       -4.660824  hypothetical protein
smi_0413    0       26       -4.038088  Helicase
smi_0412    3       27       -3.268380  ATPase involved in DNA replication initiation
smi_0411    2       20       -2.082502  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0473  BCTERIALGSPD  26     0.031    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 25.6 bits (56), Expect = 0.031
Identities = 14/58 (24%), Positives = 27/58 (46%), Gaps = 2/58 (3%)

Query: 12 FLSLIPVIGLYFSMKDRATKQENRLTVLE-KDIENLNEFKR-SANKRLDNHDEQNKAI 67
L IPVIG F + + N + + I + +E+++ S+ + +D Q+K
Sbjct: 562 LLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQR 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0462  cloacin  35     8e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 35.5 bits (81), Expect = 8e-04
Identities = 28/86 (32%), Positives = 35/86 (40%), Gaps = 10/86 (11%)

Query: 665 GSVGEAFNNGYKFGQG-IDKAVGGFFKGAGDSNGAG----NNFLGD-QGTTPYELSPANS 718
G G N G G I+ G G G S+G+G NN G G+ +
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHW----GG 58

Query: 719 APGQGDGGQGGGGGGHNPTGGKLDEV 744
G G+GG G GG + TGG L V
Sbjct: 59 GSGHGNGGGNGNSGGGSGTGGNLSAV 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_0445  SALSPVBPROT  46     3e-07    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 45.9 bits (108), Expect = 3e-07
Identities = 47/180 (26%), Positives = 81/180 (45%), Gaps = 26/180 (14%)

Query: 399 KWLKNLSSYEVDSIHEYTTAMYEDYNHVLR-------EGKQGFLDKITGGSSQKLSDETK 451
KW S ++ ++ Y+ Y N LR + K+ L + +++ +E K
Sbjct: 385 KWAIVEESKQIQALRYYSAQGYSVINKYLRGDDYPETQAKETLLSRDYLSTNEPSDEEFK 444

Query: 452 K----WYNDIEKKSEHIISAISTYKAEKTFKTYRLFNQLEDDFL--VNAVGKTLVIDKGF 505
+ NDI + +S++ ++ +L D L +G ++IDK F
Sbjct: 445 NAMSVYINDIAEG----LSSLPETDHRVVYRGLKLDKPALSDVLKEYTTIG-NIIIDKAF 499

Query: 506 MSTSLDKAVIEEFGGGDVEIQLNILIKKGQSVGAYIGELSNYANEKEFIIKPNTRFKILS 565
MSTS DKA I + LNI ++KG G +G+++++ E E + PNT+ KI S
Sbjct: 500 MSTSPDKAWIN-------DTILNIYLEKGHK-GRILGDVAHFKGEAEMLFPPNTKLKIES 551


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
smi_0422  SECA  33     0.003    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 0.003
Identities = 20/63 (31%), Positives = 31/63 (49%)

Query: 210 YERLAKGKQAIVYTHSVEYAERVAKRFIEQGYQSAVVSGKTPQSERESHMQAFREGELTI 269
ER AKG+ +V T S+E +E V+ + G + V++ K +E QA +TI
Sbjct: 443 KERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAAVTI 502

Query: 270 MVN 272
N
Sbjct: 503 ATN 505


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0418  CHLAMIDIAOMP  28     9e-04    Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 28.0 bits (62), Expect = 9e-04
Identities = 9/23 (39%), Positives = 13/23 (56%)

Query: 3 LWKCTCADCGREFDWYDNYPPLE 25
LW+C CA G F + + P +E
Sbjct: 201 LWECGCATLGASFQYAQSKPKVE 223


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0414  PF05932  27     0.003    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 26.7 bits (59), Expect = 0.003
Identities = 5/29 (17%), Positives = 10/29 (34%)

Query: 7 RLNVEVSGIEELKEACKEVSKKAEELQEA 35
+ E + LK + + +EA
Sbjct: 97 SIPREKLSVPTLKREMAGLLEWMRGWREA 125


28  smi_0396  smi_0366  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0396    1       25       -5.016410  phenylalanyl-tRNA synthetase, beta subunit
smi_0395    1       27       -6.136453  hypothetical protein
smi_0394    -1      14       -2.849884  hypothetical protein
smi_0393    0       14       -3.159218  choline-binding protein Cbp5
smi_0392    0       15       -2.508402  CTP synthase (UTP--ammonia ligase)
smi_0391    -1      13       -1.504010  DNA-dependent RNA Polymerase, delta subunit
smi_0390    0       15       1.681199   transposase, ISSmi1
smi_0389    2       26       3.986292   ABC transporter permease
smi_0388    0       18       2.212123   ABC transporter ATP-binding protein
smi_0387    -1      20       3.439591   TetR family transcriptional regulator
smi_0386    -2      18       2.267978   hypothetical protein
smi_0385    -1      19       2.453039   DNA or RNA helicases of superfamily II
smi_0384    2       21       3.874945   cell filamentation protein Fic-related protein
smi_0383    1       18       3.854113   hypothetical protein
smi_0382    4       23       3.575535   molecular chaperone, HSP90 family
smi_0381    5       26       3.332367   type II DNA modification enzyme
smi_0380    4       26       3.519821   valyl-tRNA synthetase
smi_0379    3       25       3.427973   shikimate kinase
smi_0378    -2      23       2.076770   acetyltransferase, GNAT family
smi_0377    -2      19       1.580036   hypothetical protein
smi_0376    -2      18       1.351118   Helicase
smi_0375    0       17       -0.548702  hypothetical protein
smi_0374    1       18       -1.771050  hypothetical protein
smi_0373    3       20       -3.042552  hypothetical protein
smi_0372    6       23       -4.542846  hypothetical protein
smi_0371    6       22       -3.970513  ribosome-binding factor A
smi_0370    7       24       -2.824793  Translation initiation factor
smi_0369    6       25       -2.265484  L7A family ribosomal protein
smi_0368    6       26       -0.958119  nucleic-acid-binding protein implicated in
smi_0367    4       27       0.405506   transcription termination-antitermination
smi_0366    3       26       1.102602   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0396  HTHTETR  44     7e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 44.2 bits (104), Expect = 7e-08
Identities = 15/76 (19%), Positives = 38/76 (50%)

Query: 1 MPPKVKFSKEAIIGTALQLVREEGMASLTARALAEQLGATPRVIFGQFANMAELQAEVIS 60
+ + +++ I+ AL+L ++G++S + +A+ G T I+ F + ++L +E+
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 AAEMVVVEYIRKALED 76
+E + E +
Sbjct: 65 LSESNIGELELEYQAK 80


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_0391  GPOSANCHOR  32     0.010    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.0 bits (72), Expect = 0.010
Identities = 15/101 (14%), Positives = 25/101 (24%), Gaps = 5/101 (4%)

Query: 391 LINILKPIISKLISERNKIATQISAEDHMEDERRLEIEREERRKEREARRQEEREKRIAE 450
+ L +S + K +S + E E+ E
Sbjct: 86 HNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALE-----GAMNFSTADS 140

Query: 451 QKAEILEQKNEHLIETNAQLETQNTLQKVMLQEKDPEKQEL 491
K + LE + L A LE + + L
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTL 181


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0389  RTXTOXIND  32     0.012    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.012
Identities = 11/73 (15%), Positives = 26/73 (35%), Gaps = 6/73 (8%)

Query: 806 YLPLADLLNVEEELARLDKELAKWQKELDMVGKKLSNERFVANAKPEVVQKERDKQADYQ 865
+ +L E + EL ++ +L+ + ++ +AK E + + +
Sbjct: 248 AIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI------LSAKEEYQLVTQLFKNEIL 301

Query: 866 AKYDATVARIDEM 878
K T I +
Sbjct: 302 DKLRQTTDNIGLL 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0379  TCRTETOQM  86     3e-19    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 85.7 bits (212), Expect = 3e-19
Identities = 45/139 (32%), Positives = 63/139 (45%), Gaps = 18/139 (12%)

Query: 442 IMGHVDHGKTTLLDTLRNSRVATGEAG------------------GITQHIGAYQIVENG 483
++ HVD GKTTL ++L + A E G GIT G
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 484 KKITFLDTPGHAAFTSMRARGASVTDITILVVAADDGVMPQTIEAINHSKAANVPIIVAI 543
K+ +DTPGH F + R SV D IL+++A DGV QT + + +P I I
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 544 NKIDKPGANPERVIGELAE 562
NKID+ G + V ++ E
Sbjct: 128 NKIDQNGIDLSTVYQDIKE 146


29  smi_0340  smi_0334  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0340    1       29       3.552810   glutamine synthetase type 1
smi_0339    0       28       3.812151   transcriptional repressor of the glutamine
smi_0338    2       23       3.187106   hypothetical protein
smi_0337    0       24       3.806062   phosphoglycerate kinase
smi_0336    0       19       3.455939   endo-beta-N-acetylglucosaminidase D
smi_0335    1       19       3.317940   Na/Pi cotransporter II-related protein
smi_0334    1       17       3.125380   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0340  BONTOXILYSIN  26     0.015    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 26.0 bits (57), Expect = 0.015
Identities = 11/57 (19%), Positives = 23/57 (40%), Gaps = 7/57 (12%)

Query: 1 MKRILIAPVRFY---QRFISPAFPPSCRFEPTCSNYMIEAIEKHGF-KGVLMGLARI 53
I +AP R+Y ++ N++ + E+ F + +++ L RI
Sbjct: 36 APNIWVAPERYYGEPLDIAEEYKLDGGIYDS---NFLSQDSERENFLQAIIILLKRI 89


30  smi_0318  smi_0277  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0318    -2      16       3.393856   ABC transporter permease
smi_0317    -3      15       3.376018   ABC transporter substrate-binding protein
smi_0316    -2      16       2.364678   peptide ABC transporter ATP-binding protein
smi_0315    -3      14       2.544488   peptide ABC transporter ATP-binding protein
smi_0314    -3      13       0.609622   peptide ABC transporter permease
smi_0313    -4      13       -1.014021  peptide ABC transporter permease
smi_0312    -1      17       -3.108028  ABC transporter substrate-binding
smi_0311    -1      19       -4.739360  *CBS domain membrane protein
smi_0310    0       24       -5.957843  DNA-entry nuclease (competence-specific
smi_0309    1       25       -6.153048  hypothetical protein
smi_0308    1       17       -3.261889  UDP-N-acetylglucosamine-1-
smi_0307    0       14       -1.845033  hypothetical protein
smi_0306    -2      14       0.485488   phosphopantetheine adenylyltransferase
smi_0305    -1      16       2.147974   methyltransferase
smi_0304    -1      23       3.910465   aspartate--ammonia ligase (asparagine synthetase
smi_0303    -2      23       4.121211   hypothetical protein
smi_0302    -1      24       4.311557   rRNA/tRNA methylase
smi_0301    -2      23       4.490175   oxaA2
smi_0300    -3      22       4.490668   histidine kinase
smi_0299    -3      23       4.949926   response regulator
smi_0298    -3      22       4.458582   ABC transporter permease
smi_0297    -3      20       4.307540   ABC transporter permease
smi_0296    -2      17       3.127220   ABC transporter ATP-binding protein
smi_0295    -2      17       3.330232   pyruvate-formate lyase activating enzyme
smi_0294    -2      16       2.764245   diaminopimelate decarboxylase
smi_0293    -2      16       2.125524   repressor of purine biosynthetic genes
smi_0292    -2      19       2.097889   transposase, ISSmi1
smi_0291    -3      14       1.569169   cmp-binding factor 1
smi_0290    -2      16       3.074476   hypothetical protein
smi_0289    -2      15       2.494006   thiamine pyrophosphokinase
smi_0288    -1      16       2.267728   pentose-5-phosphate-3-epimerase
smi_0287    -2      17       2.859324   hypothetical protein
smi_0286    -2      15       2.730890   16S rRNA
smi_0285    -2      16       3.393272   hypothetical protein
smi_0284    -2      16       3.177984   hypothetical protein
smi_0283    -2      20       2.788441   amino acid permease
smi_0282    -1      16       2.815581   substrate-binding lipoprotein, oligopeptide
smi_0281    -1      13       2.330480   50S ribosomal protein L34
smi_0280    -2      13       2.757672   aspartate aminotransferase
smi_0279    -1      17       2.129341   hypothetical protein
smi_0278    0       25       2.375316   hypothetical protein
smi_0277    0       35       3.184841   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0315  LPSBIOSNTHSS  150    2e-49    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 150 bits (381), Expect = 2e-49
Identities = 55/155 (35%), Positives = 96/155 (61%), Gaps = 1/155 (0%)

Query: 5 IGLFTGSFDPMTNGHLDMIERASKLFDKLYVGIFFNPHKQGFLPLENRKRGLEKAVKHLE 64
++ GSFDP+T GHLD+IER +LFD++YV + NP+KQ ++ R + KA+ HL
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 65 NVKVVSSHDELVVDVAKRLGATCLVRGLRNAADLQYEASFDYYNHQLSPNIETIYLHSRS 124
N +V S L V+ A++ A ++RGLR +D + E N L+ ++ET++L + +
Sbjct: 62 NAQVDSFEG-LTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTST 120

Query: 125 EHLYISSSGVRELLKFGQDIACYVPESILEEIRNE 159
E+ ++SSS V+E+ +FG ++ +VP + + ++
Sbjct: 121 EYSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQ 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_0310  60KDINNERMP  114    1e-30    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 114 bits (287), Expect = 1e-30
Identities = 60/225 (26%), Positives = 108/225 (48%), Gaps = 21/225 (9%)

Query: 35 GFIWNTIGAPMAEAIKYFANDKGLGFGVGIIIVTIIVRLIILPLGIYQSWKATLHSEKMN 94
G++W I P+ + +K+ + G +G III+T IVR I+ PL + + KM
Sbjct: 331 GWLW-FISQPLFKLLKWIHSFVG-NWGFSIIIITFIVRGIMYPLT-KAQYTSMA---KMR 384

Query: 95 ALKHVLEPHQTRLKEATTQEEKLEAQQALFAAQKEHGISMFGGIGCFPVLLQMPFFSAIY 154
L +P ++E +++ Q + A K ++ GG CFP+L+QMP F A+Y
Sbjct: 385 ML----QPKIQAMRERLGDDKQ-RISQEMMALYKAEKVNPLGG--CFPLLIQMPIFLALY 437

Query: 155 FAAQHTEGVAQASYLG----IPLGSPSMILVAFAGILYYLQSLLSLHGVEDEMQREQIKK 210
+ + + QA + + P IL G+ + +S V D MQ +K
Sbjct: 438 YMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQ----QK 493

Query: 211 MIYMSPLMIVVFSLFSPASVTLYWVVGGFMMILQQFIVNYIVRPK 255
++ P++ VF L+ P+ + LY++V + I+QQ ++ + +
Sbjct: 494 IMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKR 538


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0308  HTHFIS  76     3e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.0 bits (187), Expect = 3e-18
Identities = 31/120 (25%), Positives = 58/120 (48%), Gaps = 3/120 (2%)

Query: 4 KILVVDDDLAILRLIKKVLEYEKYEVVIRNKIEE-IDLCDFIGFDLILLDIMMP-VSGLE 61
ILV DDD AI ++ + L Y+V I + DL++ D++MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ICQMIRE-QITVPICFITAKDMDEDLVAGINAGADDYIMKPFSMQELLARVKMHLRREER 120
+ I++ + +P+ ++A++ + GA DY+ KPF + EL+ + L +R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0305  PF05272  32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.003
Identities = 12/39 (30%), Positives = 16/39 (41%)

Query: 27 AKAGRVTAFLGPNGAGKSSTLRILLGLDKATSGLTKIGD 65
K G G GKS+ + L+GLD + IG
Sbjct: 593 CKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGT 631


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_0297  FLGHOOKAP1  28     0.047    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 27.6 bits (61), Expect = 0.047
Identities = 8/31 (25%), Positives = 14/31 (45%)

Query: 11 LAADYANFEREIKRLEATGAEYAHIDIMDGH 41
A A+ +I RL GA + +++D
Sbjct: 171 YAKQIASLNDQISRLTGVGAGASPNNLLDQR 201


31  smi_0268  smi_0263  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0268    -1      18       3.362565   xanthine/uracil permease
smi_0267    0       19       3.502603   hypothetical protein
smi_0266    1       20       3.319739   alcohol dehydrogenase, propanol-preferring,
smi_0265    1       20       3.456677   phosphotransferase system, mannose-specific
smi_0264    1       19       3.709522   phosphotransferase system, mannose-specific,
smi_0263    4       25       3.001735   mannose-specific phosphotransferase system
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0263  TCRTETOQM  620    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 620 bits (1600), Expect = 0.0
Identities = 181/667 (27%), Positives = 296/667 (44%), Gaps = 57/667 (8%)

Query: 9 KTRNIGIMAHVDAGKTTTTERILYYTGKIHKIGETHEGASQMDWMEQEQERGITITSAAT 68
K NIG++AHVDAGKTT TE +LY +G I ++G +G ++ D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAQWNNHRVNIIDTPGHVDFTIEVQRSLRVLDGAVTVLDSQSGVEPQTETVWRQATEYGV 128
+ QW N +VNIIDTPGH+DF EV RSL VLDGA+ ++ ++ GV+ QT ++ + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 129 PRIVFANKMDKIGADFLYSVSTLHDRLQANAHPIQLPIGSEDDFRGIIDLIKMKAEIYTN 188
P I F NK+D+ G D + ++L A +IK K E+Y N
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEI------------------VIKQKVELYPN 163

Query: 189 DLGTDILEEDIPAEYLDQAQEYREKLVEAVAETDEELMMKYLEGEEITNEELKAGIRKAT 248
T+ E + + V E +++L+ KY+ G+ + EL+
Sbjct: 164 MCVTNFTESE---------------QWDTVIEGNDDLLEKYMSGKSLEALELEQEESIRF 208

Query: 249 INVEFFPVLCGSAFKNKGVQLMLDAVIDYLPSPLDIPAIKGINPDTDEEETRPASDEEPF 308
N FPV GSA N G+ +++ + + S +
Sbjct: 209 HNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSEL 249

Query: 309 AALAFKIMTDPFVGRLTFFRVYSGVLQSGSYVLNTSKGKRERIGRILQMHANSRQEIDTV 368
FKI RL + R+YSGVL V + K K +I + +ID
Sbjct: 250 CGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKA 308

Query: 369 YSGDIAAAVGLKDTTTGDSLTDEKAKIILESINVPEPVIQLMVEPKSKADQDKMGIALQK 428
YSG+I + L D K E I P P++Q VEP ++ + AL +
Sbjct: 309 YSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREMLLDALLE 367

Query: 429 LAEEDPTFRVETNVETGETVISGMGELHLDVLVDRMRREFKVEANVGAPQVSYRETFRAS 488
+++ DP R + T E ++S +G++ ++V ++ ++ VE + P V Y E R
Sbjct: 368 ISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME--RPL 425

Query: 489 TQARGFFKRQSGGKGQFGDVWIEFTPNEEGKGFEFENAIVGGVVPREFIPAVEKGLVESM 548
+A + + + + +P G G ++E+++ G + + F AV +G+
Sbjct: 426 KKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEGIRYGC 485

Query: 549 ANGVLAGYPMVDVKAKLYDGSYHDVDSSETAFKIAASLALKEAAKSAQPAILEPMMLVTI 608
G L G+ + D K G Y+ S+ F++ A + L++ K A +LEP + I
Sbjct: 486 EQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYLSFKI 544

Query: 609 TVPEENLGDVMGHVTARRGRVDGMEAHGNSQIVRAYVPLAEMFGYATVLRSASQGRGTFM 668
P+E L + + N I+ +P + Y + L + GR +
Sbjct: 545 YAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGRSVCL 604

Query: 669 MVFDHYE 675
Y
Sbjct: 605 TELKGYH 611


32  smi_0236  smi_0231  Y  N  Y  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0236    3       23       1.483579   histidine kinase
smi_0235    3       22       1.327547   hypothetical protein
smi_0234    3       25       0.567543   ABC transporter ATP-binding protein
smi_0233    5       35       0.620050   hypothetical protein
smi_0232    5       37       1.862735   hypothetical protein
smi_0231    3       30       1.520649   **********transposase, ISSmi1
33  smi_0195  smi_0180  Y  N  Y  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0195    2       23       -0.497862  hypothetical protein
smi_0194    1       20       -1.543537  alcohol dehydrogenase
smi_0193    -2      21       -4.150947  N-acetylglucosamine-6-phosphate deacetylase
smi_0192    0       25       -7.157900  transposase, ISSmi1
smi_0191    3       31       -7.685199  hypothetical protein
smi_0190    2       32       -7.647648  tRNA-guanine transglycosylase (guanine insertion
smi_0189    1       30       -7.090474  hypothetical protein
smi_0188    1       30       -7.285077  pyrrolidone-carboxylate peptidase
smi_0187    1       28       -7.343887  hypothetical protein
smi_0186    1       26       -6.784016  hypothetical protein
smi_0185    1       27       -7.563796  multi antimicrobial extrusion (MATE) family
smi_0184    1       28       -7.721850  threonine synthase
smi_0183    2       29       -8.153022  hypothetical protein
smi_0182    3       29       -8.557292  hypothetical protein
smi_0181    2       30       -7.612184  hypothetical protein
smi_0180    -2      24       -5.169730  hypothetical protein
34  smi_0115  smi_0100  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
smi_0115    2       18       -1.012525  ABC transporter ATP-binding protein
smi_0114    2       19       -1.133506  ABC transporter ATP-binding protein
smi_0113    2       24       -3.246360  hypothetical protein
smi_0112    2       33       -6.150295  hypothetical protein
smi_0111    1       26       -4.112246  transcriptional regulator
smi_0110    -1      23       -3.645215  hypothetical protein
smi_0109    0       20       -3.291068  histidyl-tRNA synthetase
smi_0108    -1      19       -3.627783  hypothetical protein
smi_0107.1  -1      23       -0.945162  positive transcriptional regulator of
smi_0107    -2      17       0.302049   hypothetical protein
smi_0106    0       13       -1.139842  hypothetical protein
smi_0105    0       13       -1.435481  dihydroxyacid dehydratase
smi_0104    1       14       -1.607318  transketolase c-terminal section
smi_0103    0       15       -0.921592  transketolase n-terminal section
smi_0102    0       16       -0.267541  PTS system, IIC component
smi_0101    2       18       -2.821181  PTS system, IIB component
smi_0100    3       30       -2.724533  transcriptional regulator, BglG family
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0101  RTXTOXIND  29     0.029    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.6 bits (64), Expect = 0.029
Identities = 33/201 (16%), Positives = 66/201 (32%), Gaps = 28/201 (13%)

Query: 74 KDIQQLEGTLVEK--GSESYKSLANQVLIELREIHQEADRLKS----------------- 114
K+I+ +E ++V++ E VL++L + EAD LK+
Sbjct: 97 KEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQIL 156

Query: 115 --YIDADVYNRIDKKVRTVRANI---DVQLERLDRESQVDLENAETEELAPELSQTLANI 169
I+ + + N+ +V + Q + + L + A
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 170 AIDHQAILDKIATSAEGDKEELTAIHSLKMEKF---KTILEGYLKIKANPKNYNRAEERL 226
A +++ + +K L SL ++ +LE K + +L
Sbjct: 217 LT-VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQL 275

Query: 227 EQAKAAIEQFDLELDQVLREL 247
EQ ++ I E V +
Sbjct: 276 EQIESEILSAKEEYQLVTQLF 296


35  smi_2088  smi_2077  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_2088  1       25       -7.741181  serine protease
smi_2087  1       28       -6.878174  23S rRNA
smi_2085  1       26       -7.612612  *competence stimulating peptide (CSP) precursor
smi_2084  1       22       -6.596533  histidine kinase
smi_2083  0       21       -5.693325  response regulator
smi_2080  -1      22       -4.451299  **hypothetical protein
smi_2079  -1      21       -4.542491  hypothetical protein
smi_2078  -1      19       -4.425593  hypothetical protein
smi_2077  -1      19       -2.997563  ABC-F family ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_2088  V8PROTEASE  62     1e-12    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 61.6 bits (149), Expect = 1e-12
Identities = 31/165 (18%), Positives = 58/165 (35%), Gaps = 34/165 (20%)

Query: 117 IVTNNHVINGASKVDIRLS------------DGTKVPGEIVGADTFSDIAVVKISSEKVT 164
++TN HV++ L +G +I D+A+VK S +
Sbjct: 114 LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQN 173

Query: 165 -------TVAEFGDSSKLTVGETAIAIGSPLG-SEYANTVTQGIVSSLNRNVSLKSEDGQ 216
A ++++ V + G P ++G ++
Sbjct: 174 KHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKITY------------- 220

Query: 217 AISTKAIQTDTAINPGNSGGPLINIQGQVIGITSSKIATNGGTSV 261
+ +A+Q D + GNSG P+ N + +VIGI + +V
Sbjct: 221 -LKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPNEFNGAV 264


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2087  56KDTSANTIGN  27     0.037    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.
Length = 533

Score = 27.2 bits (60), Expect = 0.037
Identities = 16/46 (34%), Positives = 22/46 (47%), Gaps = 1/46 (2%)

Query: 14 KYLKDGIAEYSKRISRFAKLEMIELADEKTPDRASESENQ-KILEI 58
K L D I + I FA + I + D P+ AS + Q KI E+
Sbjct: 262 KVLSDKIIQIYSDIKPFADIAGINVPDTGLPNSASIEQIQSKIQEL 307


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_2080  HTHTETR  49     1e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.2 bits (117), Expect = 1e-09
Identities = 22/104 (21%), Positives = 44/104 (42%), Gaps = 8/104 (7%)

Query: 6 KRLKTKRTIENAMVQLLMEQPFDQISTVKLAEKAGISRSSFYTHYKDKYDMIEHYQSKLF 65
+ +T++ I + ++L +Q S ++A+ AG++R + Y H+KDK D+
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 66 HTF-EYIFQKHAHHK-------RDAILEVFEYLESEPLLAALLS 101
E + A R+ ++ V E +E L+
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLME 111


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_2079  ICENUCLEATIN  34     0.003    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 34.0 bits (77), Expect = 0.003
Identities = 59/181 (32%), Positives = 74/181 (40%), Gaps = 33/181 (18%)

Query: 479 STSLTGLSSGLTEIQGTLTSKLVPASQSITSGVNAY-TAGVDK---VSQGASQLSEKNST 534
ST G S LT G+ ++ S ++T+G + TAG D G+S S S
Sbjct: 982 STQTAGYQSTLTAGYGS--TQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSF 1039

Query: 535 LTGSLDQLVSGSTTLTQKSSKLTAGVGQLVEKTPELVSGIEKLST---GSNQLNQKSQEL 591
LT GST ++ S LTAG G L+SG T GSNQ+ L
Sbjct: 1040 LTAGY-----GSTLISGLRSVLTAGYGS------SLISGRRSSLTAGYGSNQIASHRSSL 1088

Query: 592 IAGVDKLQ-----------SGSSQLADKSSQLISGAS--QLESGANKLADGAGKLAEGGT 638
IAG + Q GSSQ A S LISGA Q+ KL GA G
Sbjct: 1089 IAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGD 1148

Query: 639 K 639
+
Sbjct: 1149 R 1149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_2077  PF05272  32     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.009
Identities = 11/30 (36%), Positives = 14/30 (46%)

Query: 32 LIGANGAGKSTFLKILAGDIEPTTGHISLG 61
L G G GKST + L G + H +G
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


36  smi_1988  smi_1978  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_1988  2       19       1.788097   hypothetical protein
smi_1987  -1      18       2.273785   hypothetical protein
smi_1986  -1      19       2.996632   hypothetical protein
smi_1985  -1      20       3.664141   hypothetical protein
smi_1984  -1      20       3.560401   transcriptional regulator
smi_1983  -1      15       3.030792   cyanate permease
smi_1982  -2      17       2.279071   glycosyl transferase
smi_1981  -1      19       2.022990   nucleoside-diphosphate sugar isomerase
smi_1980  0       20       -1.078808  phosphatase
smi_1979  1       23       -5.331153  L-serine dehydratase, alpha subunit
smi_1978  2       29       -7.181608  L-serine dehydratase, beta subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1988  IGASERPTASE  42     4e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.0 bits (98), Expect = 4e-06
Identities = 29/165 (17%), Positives = 58/165 (35%), Gaps = 14/165 (8%)

Query: 7 ESNDFVKTSSKNKPDEQAQDGADKAEETIPDLDTPIEKNTQLEKEVSQAEAELESQQEEK 66
++ + K + N + ++ + T K T ++ +A+ E E QE
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP 1123

Query: 67 IETPEDS--EAKTETEEKKALDSTEEEPDLSKETEKVTKAEENQEALSQQKTTTKEPLLL 124
T + S + ++ET + +A + E +P +E Q + T +
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDP--------TVNIKEPQSQTNTTADTEQPAKET 1175

Query: 125 SKSLESPYIPDQAQKSTDRWKEQVLDFWSWLVEALKSPTSKLETS 169
S ++E P + + E + A PT E+S
Sbjct: 1176 SSNVEQPVTESTTVNTGNSVVENPEN----TTPATTQPTVNSESS 1216



Score = 36.6 bits (84), Expect = 2e-04
Identities = 26/133 (19%), Positives = 51/133 (38%), Gaps = 2/133 (1%)

Query: 9 NDFVKTSSKNKP-DEQAQDGADKAE-ETIPDLDTPIEKNTQLEKEVSQAEAELESQQEEK 66
N V T++ P + QA + + E I +D E E+ ++E
Sbjct: 989 NQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQES 1048

Query: 67 IETPEDSEAKTETEEKKALDSTEEEPDLSKETEKVTKAEENQEALSQQKTTTKEPLLLSK 126
++ + TET + + E + ++ T+ A+ E Q T TKE + K
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK 1108

Query: 127 SLESPYIPDQAQK 139
++ ++ Q+
Sbjct: 1109 EEKAKVETEKTQE 1121



Score = 33.5 bits (76), Expect = 0.002
Identities = 32/159 (20%), Positives = 57/159 (35%), Gaps = 21/159 (13%)

Query: 20 PDEQAQDGADKAEETIPDLDTPIEKNTQ--LEKEVSQAEAELESQQEEKIETPEDSEAKT 77
P E + E +EKN Q E E E++ K T + A++
Sbjct: 1033 PSETTE----TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQS 1088

Query: 78 ETEEKKALDSTEEEPDLSKETEKVTKAEENQEALSQQKTTTKEPLLLSKSLESPYIPDQA 137
+E K+ + +KET V K EE + +++ T + P + S+ +
Sbjct: 1089 GSETKET------QTTETKETATVEK-EEKAKVETEK--TQEVPKVTSQVSPKQEQSETV 1139

Query: 138 QKSTDRWKEQVLDFWSWLVEALKSPTSKLETSSTHSYTA 176
Q + +E +K P S+ T++ A
Sbjct: 1140 QPQAEPAREND------PTVNIKEPQSQTNTTADTEQPA 1172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1986  TCRTETB  28     0.025    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.9 bits (62), Expect = 0.025
Identities = 19/103 (18%), Positives = 36/103 (34%), Gaps = 4/103 (3%)

Query: 76 NRQILHIALLAL---LAAPIGIPLGIAILVSL-FAILVAALTVILAFFAVSILGIIGGFL 131
N +L+++L + P + L F+I A + + L + G +
Sbjct: 29 NEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIII 88

Query: 132 FLVESFTVLAQAKSAFILIFGSGLLAIGASSLVLLGISYVARF 174
S +LI + GA++ L + VAR+
Sbjct: 89 NCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1984  TCRTETA  44     5e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.4 bits (105), Expect = 5e-07
Identities = 64/360 (17%), Positives = 118/360 (32%), Gaps = 17/360 (4%)

Query: 6 LFFVPGIILIGVSLRTPFTVLPIILGDISQGLGVEVSSLGVLTSLPLLMFTLFSLFSTRL 65
+ + +G+ L P VLP +L D+ + G+L +L LM + L
Sbjct: 10 ILSTVALDAVGIGLIMP--VLPGLLRDLVHS-NDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 66 AQKIGLEHFFTYSLFFLTIGSLIRLI--NLPLLYLGTL---MVGASIAVINVLLPSLIQA 120
+ + G SL + I L +LY+G + + GA+ AV + +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDG 126

Query: 121 NQ-PKKIGFLTTLYVTSMGIATALASYLAVPITQASSWKGLILLLTLLCLATFLVWLP-- 177
++ + GF++ + M L + A + L FL+
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 178 -NHRYNHRLAPQTKQKSQTKVMHNKQVWAVIVFAGFQSLLFYTAMTWLPTMAIHAGLSSH 236
R R A + + VF Q + A W+ +
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDAT 246

Query: 237 EAGLLTSIFSLISIPFSMTIPSLTTSLSTRNRQLMLTLVSLAGMVGISMLFFPVGNFFYW 296
G+ + F ++ I + R LML + +A G +L F W
Sbjct: 247 TIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGM--IADGTGYILLAFATR---GW 301

Query: 297 LAIHLLIGTATSALFPYLMVNFSLKTSAPEKTAQLSGLSQTGGYILAAFGPTLFGYSFDL 356
+A +++ A+ + + + E+ QL G + + GP LF +
Sbjct: 302 MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1983  BCTERIALGSPF  38     5e-05    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 37.5 bits (87), Expect = 5e-05
Identities = 23/85 (27%), Positives = 40/85 (47%), Gaps = 10/85 (11%)

Query: 186 DLLWLNMIATGAKTGNLDQILCQVRVGAGMFERRGGLRYLKLYRQARQRMLKRGQISYME 245
+ L+ M+A G +G+LD +L ++ A E+R +R + +Q M+
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRL---ADYTEQRQQMR-----SRIQQAMIY--PCVLTV 181

Query: 246 YAKSVAIQMVVALCPGFVRQFIFMK 270
A +V ++ + P V QFI MK
Sbjct: 182 VAIAVVSILLSVVVPKVVEQFIHMK 206


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1982  NUCEPIMERASE  81     9e-19    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 81.0 bits (200), Expect = 9e-19
Identities = 50/284 (17%), Positives = 91/284 (32%), Gaps = 57/284 (20%)

Query: 294 TILVTGAGGSIGSEICRQ----------VSRFNPERIVLLGHGENSIYLVYHELIRKFQG 343
LVTGA G IG + ++ + N V L EL+ +
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQAR-------LELLAQ--- 51

Query: 344 IDYVPVIADIQDYDRLLQVFEQYKPAIVYHAAAHKHVPMMERNPKEAFKNNIRGTYNVAR 403
+ D+ D + + +F V+ + V NP +N+ G N+
Sbjct: 52 PGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILE 111

Query: 404 AVDEAKVPKMVMIST---------------DKAVNPPNVMGATKRVAELIVTGFNQRSQS 448
K+ ++ S+ D +P ++ ATK+ EL+ ++
Sbjct: 112 GCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171

Query: 449 TYCAVRFGNVLGSRGS---VIPVFERQIAEGGPVTV-TDFRMTRYFMTI----------- 493
+RF V G G + F + + EG + V +M R F I
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 494 -------PEASRLVIHAGAYAKDGEVFILDMGKPVKIYDLAKKM 530
+ + A V+ + PV++ D + +
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQAL 275


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1978  PF03544  30     0.004    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 30.3 bits (68), Expect = 0.004
Identities = 19/64 (29%), Positives = 24/64 (37%), Gaps = 3/64 (4%)

Query: 66 VHMIYVGQELVIDGPAAPVAPASTTYEAPAAQDE--AVSATVAETIEVEEETPAASGTVA 123
++Y VI+ PA P P S T APA + AV +E E E
Sbjct: 30 AGLLYTSVHQVIELPA-PAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPK 88

Query: 124 EETV 127
E V
Sbjct: 89 EAPV 92


37  smi_1540  smi_1527  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_1540  2       18       0.361601   ATPase component of ABC transporter with
smi_1539  2       16       0.261605   cation efflux family protein
smi_1538  2       16       0.277201   yybP-ykoY element
smi_1537  1       14       -0.326347  cation-transporting ATPase, E1-E2 family
smi_1536  1       17       0.535620   polypeptide deformylase
smi_1535  1       19       0.694375   hypothetical protein
smi_1534  1       19       0.847732   hypothetical protein
smi_1533  2       23       1.316970   hypothetical protein
smi_1532  2       22       1.684477   cell wall surface anchor family protein
smi_1531  2       23       2.587496   N-acetyl-beta-hexosaminidase
smi_1530  0       14       2.525274   transposase
smi_1529  -1      17       2.955842   transposase, IS1167
smi_1528  -1      18       3.666721   beta-galactosidase
smi_1527  -1      19       3.887869   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1540  FLGMRINGFLIF  30     3e-04    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.9 bits (67), Expect = 3e-04
Identities = 9/28 (32%), Positives = 14/28 (50%)

Query: 32 KKDKFLSILTSLAGIALVLVAVWLGWPK 59
++ F+ L + LVLV W+ W K
Sbjct: 450 QQQSFIDQLLAAGRWLLVLVVAWILWRK 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1538  V8PROTEASE  74     1e-15    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 74.3 bits (182), Expect = 1e-15
Identities = 68/344 (19%), Positives = 127/344 (36%), Gaps = 40/344 (11%)

Query: 86 PKTEEELLAKEKETATSSAVSDTLPEELRGKLNKAEENGRTASKEELEKEDK--SLVPED 143
T L++ A SS D P++ + + + + + + LE+ + ++P +
Sbjct: 15 TLTTATLVSSPAANALSSKAMDNHPQQTQSSKQQTPKIQKGGNLKPLEQREHANVILPNN 74

Query: 144 ----VAKTKNGVLNYGATVEIKSSAG----LGSGIVIGENLVLSVSHNFIKDVPDGNNRK 195
+ T NG +Y I+ A + SG+V+G++ +L+ H D +
Sbjct: 75 DRHQITDTTNG--HYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPH-AL 131

Query: 196 VADNVESDGDVYTVSYKGAPDVKFSKNDVKHWDREGFLKGYKNDLAIVKL------RTPL 249
A + D Y P+ F+ + + EG DLAIVK +
Sbjct: 132 KAFPSAINQDNY-------PNGGFTAEQITKYSGEG-------DLAIVKFSPNEQNKHIG 177

Query: 250 ANAPVEVIDKPSTIKVGDKVHVFGYPKGELDPILNTTVEDINNHGEGVRGISYQGS-EPG 308
+ + +V + V GYP + + + I + Y S G
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKIT--YLKGEAMQYDLSTTGG 235

Query: 309 ASGGGIFDENGKLIGIHQNGVSGKRSGGILFSPAQLEWIQNYIKGIETTKPAGLDALDKQ 368
SG +F+E ++IGIH GV + +G + + ++N++K D
Sbjct: 236 NSGSPVFNEKNEVIGIHWGGVPNEFNGAVFINEN----VRNFLKQNIEDIHFANDDQPNN 291

Query: 369 VEDKEEKPKEDKPQEEKPADNKPAENKPADNKPAENKPADNKPA 412
++ + D P +N N P + +N +DN A
Sbjct: 292 PDNPDNPNNPDNPNNPDEPNNPDNPNNPDNPDNGDNNNSDNPDA 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1537  GPOSANCHOR  39     4e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 38.5 bits (89), Expect = 4e-04
Identities = 17/74 (22%), Positives = 29/74 (39%), Gaps = 4/74 (5%)

Query: 11 HYSIRKFTIGAASVMIGASIFGAGMVQA----AETEGTAETEGTVTQAQPLDKLPADIAA 66
HYS+RK G ASV + ++ GAG+V + ++T+ + DK +
Sbjct: 9 HYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEIENNT 68

Query: 67 AIEKAEASAGTVDG 80
K +
Sbjct: 69 LKLKNSDLSFNNKA 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_1534  GPOSANCHOR  50     1e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 49.7 bits (118), Expect = 1e-07
Identities = 39/203 (19%), Positives = 72/203 (35%), Gaps = 5/203 (2%)

Query: 11 DKRCHYSIRKFAIGVASVMIGASIFGIS-AVQAEEAASSNTQTEETTVHQAQP-LDKLPD 68
+ HYS+RK G ASV + ++ G V E ++ T+++ T+ + Q DK
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 69 DVAAAIAKADENGGR-EFVKPKSELAEDKVTKDTETTRPANDGSHELASPKVETPNKVEE 127
+ K + + +K ++ ++++ E R + E AS E + +
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 128 GNKAEDKQKSEEANPKPVESAVTAGTEVRDDAKKTSEKDQVKQTTDIKSSSEKTQALSKE 187
KA + + + A K EK + S K + L E
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 188 SSKADVEKEKQLLSDRKQDFNKD 210
KA +E + L +
Sbjct: 185 --KAALEARQAELEKALEGAMNF 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1532  PF05043  292    1e-95    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 292 bits (748), Expect = 1e-95
Identities = 201/488 (41%), Positives = 299/488 (61%), Gaps = 2/488 (0%)

Query: 1 MRNLLSTKDQRQLRLMETLIQNRNWLRLHELADKLGCTERILKSDLNELRTAFPTIDIQS 60
MR+LLS K RQL L+E L +++ W ELA+ L CTER +K DL+ +++AFP + S
Sbjct: 1 MRDLLSKKSHRQLELLELLFEHKRWFHRSELAELLNCTERAVKDDLSHVKSAFPDLIFHS 60

Query: 61 SINGIMIDLNMQTSVEDVYQHFLAHSQSFQLLEYLFFNEGLPIYRTLENLHSSRANLYRL 120
S NGI I + +E VY HF HS F +LE++FFNEG + + S ++LYR+
Sbjct: 61 STNGIRIINTDDSDIEMVYHHFFKHSTHFSILEFIFFNEGCQAESICKEFYISSSSLYRI 120

Query: 121 GRNITKTLSTQFQIELSFTPSEIRGNEIDIRYFYAQYFSERYYFLDWPFPYIPEEDLTEF 180
I K + QFQ E+S TP +I GNE DIRYF+AQYFSE+YYFL+WPF E L++
Sbjct: 121 ISQINKVIKRQFQFEVSLTPVQIIGNERDIRYFFAQYFSEKYYFLEWPFENFSSEPLSQL 180

Query: 181 ADFFYKITNYPMHFSIYRMYKLMLAISIYRIKNGHFIDLPNH-FYDEYYPLLMGIPNFEE 239
+ YK T++PM+ S +RM KL+L ++YRIK GHF+++ F D+ LM E
Sbjct: 181 LELVYKETSFPMNLSTHRMLKLLLVTNLYRIKFGHFMEVDKDSFNDQSLDFLMQAEGIEG 240

Query: 240 TLVYFSEKLGLEITPDIIAQIFISFIQNNLFLDPQEFLNSLEENSEARYSYQLLSQILER 299
F + + + +++ Q+F+S+ Q F+D F+ ++++S SY LLS +++
Sbjct: 241 VAQSFESEYNISLDEEVVCQLFVSYFQKMFFIDESLFMKCVKKDSYVEKSYHLLSDFIDQ 300

Query: 300 LSKQYQITFTNHDELIWHLHNTAFFESQEIFSTPILFEQKTLTIKKFEVYFPDFMASARQ 359
+S +YQI N D LIWHLHNTA QE+F+ ILF+QK TI+ F+ FP F++ ++
Sbjct: 301 ISVKYQIEIENKDNLIWHLHNTAHLYRQELFTEFILFDQKGNTIRNFQNIFPKFVSDVKK 360

Query: 360 ELAQYRQAIGKHDHPEQLEHLMYTILTHAENLSTLLLENRPPIKVLIISNFDHALSLTFV 419
EL+ Y + + + HL YT +TH ++L LL+N+P +KVL++SNFD +
Sbjct: 361 ELSHYLETLEVCSSSMMVNHLSYTFITHTKHLVINLLQNQPKLKVLVMSNFDQYHAKFVA 420

Query: 420 DMLSYYCNNRFIFDIWDELKTSPEILNQTDYDIIVSNFYIPGI-TKKFICRNHLSIMDLV 478
+ LSYYC+N F ++W EL+ S E L + YDII+SNF IP I K+ I N+++ + L+
Sbjct: 421 ETLSYYCSNNFELEVWTELELSKESLEDSPYDIIISNFIIPPIENKRLIYSNNINTVSLI 480

Query: 479 NHLNTLSN 486
LN +
Sbjct: 481 YLLNAMMF 488


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1531  TONBPROTEIN  51     2e-08    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 51.1 bits (122), Expect = 2e-08
Identities = 26/81 (32%), Positives = 31/81 (38%), Gaps = 4/81 (4%)

Query: 2882 PAQPTPNVPIPEVPVK-PVPAQPTPNVPTPEVPVQPTPVVPTPEVPVKPVPAVPEQP--- 2937
+P V P PV P P P E PV P P+ KPV V EQP
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113

Query: 2938 VVPTPAQPATPVNANPVVSTT 2958
V P ++PA+P T
Sbjct: 114 VKPVESRPASPFENTAPARLT 134



Score = 41.1 bits (96), Expect = 3e-05
Identities = 22/78 (28%), Positives = 26/78 (33%), Gaps = 1/78 (1%)

Query: 2884 QPTPNVPIPEVPVKPVPAQPTPNVPTPEVP-VQPTPVVPTPEVPVKPVPAVPEQPVVPTP 2942
P P PI V P +P V P P V+P P P K P V E+P
Sbjct: 38 LPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 97

Query: 2943 AQPATPVNANPVVSTTVK 2960
+P VK
Sbjct: 98 PKPKPVKKVQEQPKRDVK 115



Score = 39.6 bits (92), Expect = 1e-04
Identities = 26/100 (26%), Positives = 31/100 (31%), Gaps = 9/100 (9%)

Query: 2868 PEVPEVPRLDIPTVPAQPTPNVPIPEVPV---KPVPAQPTPNVPTPEVPVQPTPVVPTPE 2924
V P + P P E PV KP P P +V QP V
Sbjct: 59 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK--- 115

Query: 2925 VPVKPVPAVPEQPVVPTPAQPATPVNA--NPVVSTTVKEN 2962
PV+ PA P + P +T A PV S
Sbjct: 116 -PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPR 154



Score = 33.8 bits (77), Expect = 0.007
Identities = 16/52 (30%), Positives = 19/52 (36%), Gaps = 1/52 (1%)

Query: 2893 EVPVKPVPAQPTPNVPTPEVPVQPTPVVPTPEVPVK-PVPAVPEQPVVPTPA 2943
+V P PAQP ++P V P PV P P P P A
Sbjct: 34 QVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEA 85



Score = 32.7 bits (74), Expect = 0.016
Identities = 30/135 (22%), Positives = 38/135 (28%), Gaps = 17/135 (12%)

Query: 2811 VTPSNDKPVPPTPNMPEGPKFAMPEPPVHELPEFNGGVPGMPEVHELPEFNSGVPGMPEV 2870
VTP++ +P PE PEP P P V E P+ P
Sbjct: 50 VTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-----KEAPVVIEKPKPKPKPKPKPV- 103

Query: 2871 PEVPRLDIPTVPAQPTPNV-PIPEVPVKPVPAQPTPNVPTPEVPVQPTPVVPTPEVPVKP 2929
V QP +V P+ P P + + P V P
Sbjct: 104 --------KKVQEQPKRDVKPVESRPASPFENTAPARLTSS--TATAATSKPVTSVASGP 153

Query: 2930 VPAVPEQPVVPTPAQ 2944
QP P AQ
Sbjct: 154 RALSRNQPQYPARAQ 168


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1529  TCRTETA  34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.0 bits (78), Expect = 0.001
Identities = 48/313 (15%), Positives = 106/313 (33%), Gaps = 14/313 (4%)

Query: 43 GLLESIFHTTSLLCEIPSGMLADRYSYKTNLYLSRIAGIVSSILMLAGQGNFWIYALAMA 102
G+L +++ C G L+DR+ + L +S V +M W+ +
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATA-PFLWVLYIGRI 104

Query: 103 VSALSYNFDSGTSAAMVYDSAVEAGLKERYLSISSFLSGVSEGTQSLGTVLAGFFVHGQL 162
V+ ++ +G A + + R+ F+S G VL G
Sbjct: 105 VAGITGA--TGAVAGAYIADITDGDERARHF---GFMSACFGFGMVAGPVLGGLMGGFSP 159

Query: 163 HLTYYIMIATSIIVLFLIWMLKEPSVKVEKADSVTMKQIMWTVKDELKRN-----PMLFN 217
H ++ A + + L S K E+ + + + R ++
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESHKGER-RPLRREALNPLASFRWARGMTVVAALMAV 218

Query: 218 WMILSQIVGVLMCMFYFYYQNQLPDLSGWQISAVMLLGSLLNIVA-VYLASKIGKNYAAL 276
+ I+ + V ++ + +++ I + +L+ +A + +
Sbjct: 219 FFIMQLVGQVPAALWVIFGEDRF-HWDATTIGISLAAFGILHSLAQAMITGPVAARLGER 277

Query: 277 RLFPILVLLTGVTYMLSYFGTPLIYILIYLISNALHALFQPIFDNDLQGRLPSEVRATML 336
R + ++ G Y+L F T ++ A + P L ++ E + +
Sbjct: 278 RALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQ 337

Query: 337 SVYSMMFSLSMIV 349
+ + SL+ IV
Sbjct: 338 GSLAALTSLTSIV 350


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1527  FLGPRINGFLGI  29     0.027    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 29.1 bits (65), Expect = 0.027
Identities = 8/21 (38%), Positives = 10/21 (47%)

Query: 31 DILSLTLGEPDFTTPKNIQDA 51
L L L PDF+T + D
Sbjct: 191 VNLVLQLRNPDFSTAVRVADV 211


38  smi_1503  smi_1496  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
smi_1503  -2      13       0.133613  hypothetical protein
smi_1502  -2      12       0.862748  cystathionine gamma-synthase
smi_1501  0       11       0.670862  MalY, bifunctional PLP-dependent enzyme
smi_1500  -2      11       0.539942  abortive infection bacteriophage resistance
smi_1499  -1      11       0.333900  hypothetical protein
smi_1498  2       13       0.155235  HepA, superfamily II DNA/RNA helicases, SNF2
smi_1497  1       18       1.772748  hypothetical protein
smi_1496  -1      18       1.975310  UDP-N-acetylmuramate-alanine ligase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1503  BLACTAMASEA  29     0.016    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 28.6 bits (64), Expect = 0.016
Identities = 15/54 (27%), Positives = 31/54 (57%), Gaps = 1/54 (1%)

Query: 3 EERFPLVSDDEIMLTEMPVMDLYDESDFISNIKGEYRDKNYLEWAPITEEKPAK 56
+ERFP++S +++L V+ D D K YR ++ ++++P++E+ A
Sbjct: 59 DERFPMMSTFKVVLC-GAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLAD 111


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1501  SACTRNSFRASE  35     3e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 3e-05
Identities = 19/76 (25%), Positives = 33/76 (43%), Gaps = 4/76 (5%)

Query: 39 ESIRKCADTFLLARDENKLLGYI-LSSPQSDNPQCLKIHSLVIEADHQRQGLGTLLLAAL 97
+ + L EN +G I + S + I + + D++++G+GT LL
Sbjct: 58 SYVEEEGKAAFLYYLENNCIGRIKIRSNWNGY---ALIEDIAVAKDYRKKGVGTALLHKA 114

Query: 98 KEVAVELDYKGFRLES 113
E A E + G LE+
Sbjct: 115 IEWAKENHFCGLMLET 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1500  PHPHTRNFRASE  28     0.025    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.
Length = 572

Score = 27.8 bits (62), Expect = 0.025
Identities = 14/51 (27%), Positives = 23/51 (45%), Gaps = 3/51 (5%)

Query: 4 RKARLEDLDRIVEIELENFSAEEAIPRSIFEAHLREIQTSFLVAEKEGRIM 54
+ E+L I + + A++A IF AHL + LV +G+I
Sbjct: 48 LEKSKEELRAIKDQTEASMGADKA---EIFAAHLLVLDDPELVDGIKGKIE 95


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1499  PF04647  30     0.015    Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 30.1 bits (68), Expect = 0.015
Identities = 10/46 (21%), Positives = 19/46 (41%)

Query: 178 SRRETVKPVKKKKSHLKAFFISLLIFLALISAGGYFGYQYVQSSLL 223
R ++K LK + +++F I A + +Q + LL
Sbjct: 127 PRNLISNTEQRKTLKLKTSMVLMVLFGGSIGAYRLYTHQIALAILL 172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1496  SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 2e-04
Identities = 16/66 (24%), Positives = 28/66 (42%), Gaps = 2/66 (3%)

Query: 52 IAETFGNWLEIEYLFVTEELRGQGTGSKLLQQAESEAKNRNCRFAFVNTYQFQAP--DFY 109
I + + IE + V ++ R +G G+ LL +A AK + + T FY
Sbjct: 82 IRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFY 141

Query: 110 KRHGYK 115
+H +
Sbjct: 142 AKHHFI 147


39  smi_1485  smi_1478  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_1485  0       11       -0.135444  NAD+ synthetase
smi_1484  0       12       -0.234315  acetyltransferases, including N-acetylases of
smi_1483  0       14       -0.151128  hypothetical protein
smi_1482  0       13       -0.109187  cytochrome c biogenesis protein CcdA
smi_1481  -2      14       2.138330   hypothetical protein
smi_1480  -2      15       2.213263   peptide methionine sulfoxide reductase
smi_1479  -2      15       3.309143   response regulator
smi_1478  -2      16       3.299106   histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_1485  HTHFIS  95     2e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 2e-24
Identities = 35/129 (27%), Positives = 65/129 (50%), Gaps = 6/129 (4%)

Query: 4 TILIVEDEYLVRQGLTKLVNVAAYDMEIIGQAENGRQAWDLIQKQVPDIILTDINMPQLN 63
TIL+ +D+ +R L + ++ A YD+ N W I D+++TD+ MP N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVR---ITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 64 GIQLASLVRETYPQVHLVFLTGYDDFDYALSAVKLGVDDYLLKPFSRQDIEEMLGKIKQK 123
L +++ P + ++ ++ + F A+ A + G DYL KPF D+ E++G I +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF---DLTELIGIIGRA 118

Query: 124 LDKEEKEEQ 132
L + ++
Sbjct: 119 LAEPKRRPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1484  PF06580  208    2e-64    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 208 bits (530), Expect = 2e-64
Identities = 58/214 (27%), Positives = 104/214 (48%), Gaps = 13/214 (6%)

Query: 341 NAMLDQIDQLMTAIRKQEETTRQYELQALSSQINPHFLYNTLDTIIWMAEFQDSQRVVQV 400
N +IDQ K ++ +L AL +QINPHF++N L+ I + +D + ++
Sbjct: 143 NYKQAEIDQW-----KMASMAQEAQLMALKAQINPHFMFNALNNIRALIL-EDPTKAREM 196

Query: 401 TKSLATYFRLAL-NQGKDLICLSDEINHVRQYLFIQKQRYGDKLEYEIDEDSTFDNLVLP 459
SL+ R +L + L+DE+ V YL + ++ D+L++E + ++ +P
Sbjct: 197 LTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVP 256

Query: 460 KLVLQPLVENALYHGIKEKEGQGHIRVSVQKQDSELVIRIEDDGVGFQDVGDSSQSQLKR 519
+++Q LVEN + HGI + G I + K + + + +E+ G S
Sbjct: 257 PMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKES------ 310

Query: 520 GGVGLQNVDQRLKLHFGEHYQMKIDSIPSKGTTV 553
G GLQNV +RL++ +G Q+K+ K +
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1482  IGASERPTASE  39     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 39.3 bits (91), Expect = 2e-04
Identities = 56/349 (16%), Positives = 110/349 (31%), Gaps = 24/349 (6%)

Query: 198 PSRPDSAEQPQLPVEKQSANEIEK--TAVVQE----RPAVAPETIVENPVVETPVAPAVE 251
+ Q +P + EI + A V P+ ET+ EN E+ E
Sbjct: 996 NITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNE 1055

Query: 252 ENPVSEKPQTRQ--EESLVEIPFETVTSPDANLAEGQTRIVTAGVKGQRRLVTKVSMVNG 309
++ Q R+ +E+ + T T+ A + Q + + V
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS-------ETKETQTTETKETATVEK 1108

Query: 310 QEVREVVEDQVVQNP--VSQVIAVGTKKE-VQPAPTPTPQAEPTHQVAKGTQEEGKTGQ- 365
+E +V ++ + P SQV + E VQP P + +PT + + + T
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 366 ---AITQPTLPEAPVETKGTQEEGKAGQAITQPTLPDAPVEVKGTQEEGKAGQALTQPEL 422
A + E PV T G + + T P + E + + +
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT-TQPTVNSESSNKPKNRHRRSV 1227

Query: 423 PEAPIEVKGTQEEGKAGQALVQEQLPEYKVTEGTLVETSTTDLDYKTETTEDPTKYTDEE 482
P V+ + L T L + + +++ +
Sbjct: 1228 RSVPHNVEPATTSSNDRSTVALCDLTS-TNTNAVLSDARAKAQFVALNVGKAVSQHISQL 1286

Query: 483 TVVRNGEKGSQVTKTTYKTVEGVKTDQVLSTSTEVTKEPVNQQVSRGTK 531
+ G+ V+ T+ + S+ + T+ +Q +S +
Sbjct: 1287 EMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTISNNVQ 1335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_1478  PF03309  35     3e-04    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 35.1 bits (81), Expect = 3e-04
Identities = 25/126 (19%), Positives = 46/126 (36%), Gaps = 14/126 (11%)

Query: 5 IIGIDLGGTSIKFAILTTAGEIQE---KWSIKTNILDEGSHIVDDMIESIQHRLDLLGVA 61
++ ID+ T +++ +G+ + +W I+T D++ +I L+G
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTA----DELALTI---DGLIGDD 54

Query: 62 AADFQGIGMGSPGVVDREKGTVIGAYNLNWKTLQPIKEKIEKALGIPFFIDNDANVAALG 121
A G S V V W + + + GIP +DN V A
Sbjct: 55 AERLTGASGLS--TVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGA-- 110

Query: 122 ERWMGA 127
+R +
Sbjct: 111 DRIVNC 116


40  smi_1362  smi_1355  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
smi_1362  -2      9        1.449979  platelet activating factor
smi_1361  -2      10       1.977891  hypothetical protein
smi_1359  -1      12       2.846260  hypothetical protein
smi_1358  -1      12       2.825177  cI-like repressor, S. pneumoniae bacteriophage
smi_1357  -1      13       2.900305  hypothetical protein
smi_1356  -3      15       2.732511  transposase orfB, ISSmu1
smi_1355  -3      15       1.999537  transposase orfA, ISSmu1
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_1362  MALTOSEBP  28     0.048    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 28.2 bits (62), Expect = 0.048
Identities = 14/37 (37%), Positives = 22/37 (59%)

Query: 132 EIGYYSIKDVKLDESGSSATVTFTSKKLHSKGLASST 168
E G Y IKDV +D +G+ A +TF + +K + + T
Sbjct: 198 ENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADT 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1359  ARGREPRESSOR  36     4e-05    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 36.0 bits (83), Expect = 4e-05
Identities = 22/98 (22%), Positives = 45/98 (45%), Gaps = 12/98 (12%)

Query: 1 MLKTERKQLILEELNQHHVVSLEKLVNLLE-----TSESTVRRDLDELEAENKLRRVHG- 54
M K +R I E + + + + ++LV++L+ +++TV RD+ EL +V
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHL----VKVPTN 56

Query: 55 GAELPHSLQEEETIQ--EKSVKNLQEKKLLAQKAASLI 90
+SL ++ K ++L + + A+ LI
Sbjct: 57 NGSYKYSLPADQRFNPLSKLKRSLMDAFVKIDSASHLI 94


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1358  LCRVANTIGEN  30     0.016    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 29.7 bits (66), Expect = 0.016
Identities = 21/78 (26%), Positives = 31/78 (39%)

Query: 80 FVQVAEDTRINVKIKADQETEINGTGPTVEPAQLEELKAILSSLTAEDTVVFAGSSAKNL 139
VQ+ +D I++ IK D + V +E LK IL+ ED ++ G L
Sbjct: 35 LVQLVKDKNIDISIKYDPRKDSEVFANRVITDDIELLKKILAYFLPEDAILKGGHYDNQL 94

Query: 140 GNVIYKDLIALTRQTGAQ 157
N I + L Q
Sbjct: 95 QNGIKRVKEFLESSPNTQ 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_1356  TYPE3IMSPROT  34     0.002    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.
Length = 354

Score = 34.0 bits (78), Expect = 0.002
Identities = 13/70 (18%), Positives = 28/70 (40%), Gaps = 1/70 (1%)

Query: 37 LIFAAFKLGAAGITLYNLIRLLVGSLAYLAIFGILLYLFFFKWIRKQEGLL-SGFFTIFA 95
+ + K+ I ++ +I+ + +L L GI I +Q ++ + F + +
Sbjct: 140 FLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQILRQLMVICTVGFVVIS 199

Query: 96 GLLLIFEAYL 105
FE Y
Sbjct: 200 IADYAFEYYQ 209


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
smi_1355  IGASERPTASE  30     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.003
Identities = 20/76 (26%), Positives = 30/76 (39%)

Query: 65 VKEIQKESPKENTSPTKETNTSQEKAQQEETPKASVKEEKKEEQKASTLDSSTPAPTPSK 124
KE Q KE + KE E + +E PK + + K+EQ + + PA
Sbjct: 1092 TKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 125 PAADNEKQSNNTPTSE 140
E QS T++
Sbjct: 1152 TVNIKEPQSQTNTTAD 1167


41  smi_0980  smi_0966  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_0980   3      18        0.645930  T-box leader
smi_0979   3      18        0.838604  transposase, ISSmi1
smi_0978  -3      17        2.035185  transposase, ISSmi1
smi_0977  -2      19        2.097453  phosphoenolpyruvate carboxylase
smi_0976  -1      19        2.283967  cell division protein FtsW
smi_0975   0      20        2.282753  peptidylprolyl isomerase
smi_0974   0      20        2.363103  O-methyltransferase
smi_0973  -1      18        2.452202  oligoendopeptidase F
smi_0972  -1      17        2.185182  competence protein CoiA
smi_0971   0      14        0.959929  hypothetical protein
smi_0970   0      13        0.786918  tellurite resistance protein TehB
smi_0969   2      14        1.072247  tmRNA(SsrA)-binding protein
smi_0968   1      15        1.950573  exoribonuclease R
smi_0967   1      16        1.950087  membrane protein involved in protein export
smi_0966   1      12        2.070993  50S ribosomal protein L33
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0980  DPTHRIATOXIN  30     0.019    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 29.7 bits (66), Expect = 0.019
Identities = 18/49 (36%), Positives = 26/49 (53%), Gaps = 4/49 (8%)

Query: 59 NESPEHLTNKEVLYQWLKKETEVQLEHP-LPELKQIAD---VFVNGNLA 103
+ESP ++E Q+L++ + LEHP L ELK + VF N A
Sbjct: 263 SESPNKTVSEEKAKQYLEEFHQTALEHPELSELKTVTGTNPVFAGANYA 311


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0979  FLAGELLIN  34     0.003    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 34.2 bits (78), Expect = 0.003
Identities = 57/378 (15%), Positives = 97/378 (25%), Gaps = 20/378 (5%)

Query: 693 DAAKQAAQDAATKANQAIDAATDNA--GVATAQT-----DGIAAIEAVTPTVAVKAAA-- 743
DAA QA + T + + A+ NA G++ AQT + I ++V+A
Sbjct: 43 DAAGQAIANRFTSNIKGLTQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGT 102

Query: 744 ---------KAEVAKKLAEKLTALEGTPNATKEEKDAAKQAAQDAATKANQAIDAATDNA 794
+ E+ ++L E T + Q + I
Sbjct: 103 NSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKI 162

Query: 795 GVATAQTDGIAAIEAVTPTV-AVKAAAKAEVAKKLAEKLTALEGTPNATKEEKDAAKQAA 853
V + DG TV +K++ K +
Sbjct: 163 DVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPT 222

Query: 854 QDAATKANQAIDAATDNAGVATAQTDGIAAIEAVTPTVAVKAAAKAEVAKKLAEKLTALE 913
N A T + D ++ T KA A A K +
Sbjct: 223 VPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKG 282

Query: 914 GTPNATKEEKDAAKQAAQDAATKANQAIDAATDNAGVATAQTDGIAAIEAVTPTVAVKAA 973
T + + + A AG A + + + V +V
Sbjct: 283 VTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQF 342

Query: 974 AKAEVAKKLAEKLTALEGTPNATKEEKDAAKQAAQDAATKA-NQAIDAATDNAGVATAQT 1032
+ K + KL+ LE E K A A + T +
Sbjct: 343 TFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGV 402

Query: 1033 DGIAAIEAVTPTVAVKAA 1050
+ +A +
Sbjct: 403 STLINEDAAAAKKSTANP 420


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
smi_0975  SECGEXPORT  30     3e-04    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 29.9 bits (67), Expect = 3e-04
Identities = 22/78 (28%), Positives = 40/78 (51%), Gaps = 5/78 (6%)

Query: 1 MYNLLLTILLVLSVVIVIAIFMQPTK--NQSSNVFDASSGDLFERSKARGFEAVMQRLTG 58
MY LL + L++++ +V I +Q K + ++ +S LF S + F M R+T
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNF---MTRMTA 57

Query: 59 ILVFFWLAIALALTVLSS 76
+L + I+L L ++S
Sbjct: 58 LLATLFFIISLVLGNINS 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0973  TCRTETA  108    2e-28    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 108 bits (272), Expect = 2e-28
Identities = 70/357 (19%), Positives = 144/357 (40%), Gaps = 9/357 (2%)

Query: 10 LRIAWFGNFLTGASISLVVPFMPIFVENLGVGSEQVAFYAGLAISVSAISAALFSPIWGI 69
L + L I L++P +P + +L V S V + G+ +++ A+ +P+ G
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDL-VHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 70 LADKYGRKPMMIRAGLAMTITMGGLAFVPNIYWLIFLRLLNGVFAGFVPNATALIASQVP 129
L+D++GR+P+++ + + +A P ++ L R++ G+ A A IA
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITD 125

Query: 130 KEKSGSVLGTLSTGVVAGTLTGPFIGGFIAELFGIRTVFLLVGSLLFLAAVLTILFIKEN 189
++ G +S G + GP +GG + F F +L L + + E+
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 190 FQPVAKEKAIPTKELFTSVKYPYL---LVNLFLTSFVIQFSAQSIGPILALYVRDLGQTE 246
+ + S ++ + L F++Q Q + ++ D +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 247 NLLFVSGLIVSSMG-FSSMMSAGVMGKLGDKVGNHRLLVVAQFYSVIIYLLCANASSPLQ 305
G+ +++ G S+ A + G + ++G R L++ Y+L A A+
Sbjct: 245 ATTI--GISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWM 302

Query: 306 LGLYRFLFGLGTGALIPGVNALLSKMTPKAGISRIFAFNQVFFYLGGVVGPMAGSAV 362
L G G +P + A+LS+ + ++ L +VGP+ +A+
Sbjct: 303 AFPIMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358



Score = 58.7 bits (142), Expect = 1e-11
Identities = 44/178 (24%), Positives = 76/178 (42%), Gaps = 2/178 (1%)

Query: 214 LVNLFLTSFVIQFSAQSIGPILALYVRDLGQTENLLFVSGLIVSSMGFSSMMSAGVMGKL 273
L+ + T + I P+L +RDL + ++ G++++ A V+G L
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 274 GDKVGNHRLLVVAQFYSVIIYLLCANASSPLQLGLYRFLFGLGTGALIPGVNALLSKMTP 333
D+ G +L+V+ + + Y + A A L + R + G+ TGA A ++ +T
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAYIADITD 125

Query: 334 KAGISRIFAFNQVFFYLGGVVGPMAGSAVAGQFGYHAVFYATSLCVAFSCLFNLLQFR 391
+R F F F G V GP+ G + G F HA F+A + + L
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSPHAPFFAAAALNGLNFLTGCFLLP 182


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0970  TCRTETOQM  36     1e-04    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 36.4 bits (84), Expect = 1e-04
Identities = 42/204 (20%), Positives = 79/204 (38%), Gaps = 28/204 (13%)

Query: 3 FKSGFVAILGRPNVGKSTFLNHVMGQKIAIMSDKAQTTRNKIMGIYTTDKEQIVFIDTPG 62
+ SG + LG + G + N ++ ++ I T+ + + ++ IDTPG
Sbjct: 25 YNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS-------FQWENTKVNIIDTPG 77

Query: 63 IHKPKTALGDFMVESAYSTLREVDTVLFMVPADEARGKGDDMIIERLKAAKVPVILVVNK 122
H DF+ E Y +L +D + ++ A + ++ L+ +P I +NK
Sbjct: 78 -HM------DFLAE-VYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFINK 129

Query: 123 IDKVHPDQLLSQIDDFRNQMDFKEIVPISALQGNNVSRLVDILSENLDEGFQYFPADQIT 182
ID+ + ID D KE + + V ++ N E Q+ D +
Sbjct: 130 IDQ-------NGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQW---DTVI 179

Query: 183 DHPERFLVSEMIREKVLHLTREEI 206
+ + L M + L E+
Sbjct: 180 EGNDDLLEKYMSGK---SLEALEL 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0967  FbpA_PF05833  685    0.0      Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 685 bits (1769), Expect = 0.0
Identities = 195/577 (33%), Positives = 321/577 (55%), Gaps = 31/577 (5%)

Query: 1 MSFDGFFLHHMVEELRRELVNGRIQKINQPFEQELVLQIRSNRQSHRLLLSAHPVFGRIQ 60
M+ DG FL+ +++EL+ ++NG+I K+NQP + E++L IR R S +LL+S+ + RI
Sbjct: 1 MALDGIFLYSIIDELKNTIINGKIDKVNQPEKDEIILNIRKGRLSFKLLISSSSNYPRIH 60

Query: 61 LTQTTFENPAQPSTFIMVLRKYLQGALIESIEQVENDRIVEMTVSNKNEIGDHIQATLII 120
LT T NP + F MVLRKY+ A I I Q+ DRIV + + +E+G + +LII
Sbjct: 61 LTDLTKPNPIKAPMFCMVLRKYISNAKIVDIHQINQDRIVVIDFESTDELGFNSIYSLII 120

Query: 121 EIMGKHSNILLVDKSSHKILEVIKHVGFSQNSYRTLLPGSTYIAPPSTESLNPFTIKDEK 180
EIMG+HSN+ L+ K + I++ IKH+ N+YR++ PG Y+ PP + LNPF +
Sbjct: 121 EIMGRHSNMTLIRKRDNIIMDSIKHITPDINTYRSIYPGIEYVYPPKSPKLNPFDFSYDM 180

Query: 181 LFEILQ--TQETTAKNLQSLFQGLGRDTANELENILVSDKL---------------STFR 223
+ + + + +F G+ + ++E+ L ++ + F+
Sbjct: 181 IENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDLSLSNLKEIVEVCKDLFK 240

Query: 224 NFFNQETKPCLTETSFSPVPFA--------NQVGEPFASLSNLLDTYYKDKAERDRVKQQ 275
+ + + + S V F + + S S LL+ +Y K + DR+K +
Sbjct: 241 EIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLENFYYAKDKSDRLKSK 300

Query: 276 ASELIRRVENELQKNRHKLKKQEKELLATDNAEEFRQKGELLTTFLHQVPNDQDQVILDN 335
+S+L + V N + + K K L ++ + F+ GELLT ++ + + L N
Sbjct: 301 SSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTANIYALKKGLSHIELAN 360

Query: 336 YYTNL--PITIALNKALTPNQNAQRYFKRYQKLKEAVKYLTDLIEETKVTILYLESVETV 393
YY+ + I L++ TP+QN Q Y+K+Y KLK++ + + + + + + YL SV T
Sbjct: 361 YYSENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTN 420

Query: 394 LNQA-GLEEIAEIREELIQTGFIRRRQ--REKIQKRKKPEQYLASDGKTIIYVGRNNLQN 450
+N A +EI EI++ELI+TG+I+ ++ + K K KP +++ DG IYVG+NN+QN
Sbjct: 421 INNADNYDEIEEIKKELIETGYIKFKKIYKSKKSKTSKPMHFISKDGID-IYVGKNNIQN 479

Query: 451 EELTFKMARKEELWFHAKDIPGSHVVISGNLDPSDEVKTDAAELAAYFSQGRLSNLVQVD 510
+ LT K A K ++WFH K+IPGSHV++ +D + +AA LAAY+S+ + S+ V VD
Sbjct: 480 DYLTLKFANKHDIWFHTKNIPGSHVIVKNIMDIPESTLLEAANLAAYYSKSQNSSNVPVD 539

Query: 511 MIEVKKLNKPTGGKPGFVTYTGQKTLRVTPDPKKISS 547
EVK + KP G KPG V Y+ +T+ VTP + +
Sbjct: 540 YTEVKNVKKPNGAKPGMVIYSTNQTIYVTPTNPNLKN 576


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0966  FLGFLGJ  30     0.035    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 29.7 bits (66), Expect = 0.035
Identities = 24/123 (19%), Positives = 48/123 (39%), Gaps = 27/123 (21%)

Query: 469 LLAHSALESDWGRSKIAKDK----NNFFGI----------TAYDTTPYLSA--------- 505
+LA +ALES WG+ +I ++ N FG+ T TT Y +
Sbjct: 174 ILAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKF 233

Query: 506 KTFDDVDKGILGATKWIKENYIDRGRTFLGNKASGM----NVEYASDPYWGEKIASVMMK 561
+ + + + + N T + G + YA+DP++ K+ +++ +
Sbjct: 234 RVYSSYLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQ 293

Query: 562 INE 564
+
Sbjct: 294 MKS 296


42  smi_0852  smi_0845  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_0852  -1      11       -1.543063  DNA gyrase, B subunit
smi_0851  -1      13       -1.349179  phosphatase
smi_0850   0      13       -0.212237  transposase, ISSmi1
smi_0849   3      24       -2.713720  4-methyl-5(b-hydroxyethyl)-thiazole
smi_0848   0      23       -3.621852  rod shape determining protein RodA
smi_0847   0      12       -0.066503  ATP-dependent helicase DinG
smi_0846   0      11        1.128540  protease
smi_0845  -1      10        1.163135  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0852  PF06580  36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 3e-04
Identities = 13/76 (17%), Positives = 30/76 (39%), Gaps = 9/76 (11%)

Query: 314 FRFENRIHRTIVTDQLLLKQL---MTI--LFDNAVKY----TEEDGEIDFLISATDRNLY 364
+FE+R+ + ++ M + L +N +K+ + G+I + + +
Sbjct: 234 IQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVT 293

Query: 365 LLVSDNGVGISTEDKK 380
L V + G K+
Sbjct: 294 LEVENTGSLALKNTKE 309


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0851  HTHFIS  86     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-21
Identities = 34/118 (28%), Positives = 55/118 (46%), Gaps = 1/118 (0%)

Query: 2 IKILLVEDDLGLSNSVFDFLDD-FADVMQVFDGEEGLYEAESGVYDLILLDLMLPEKNGF 60
IL+ +DD + + L DV + +G DL++ D+++P++N F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 61 QVLKELREKGITTPVLIMTAKESLDDKGHGFELGADDYLTKPFYLEELKMRIQALLKR 118
+L +++ PVL+M+A+ + E GA DYL KPF L EL I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0849  UREASE  25     0.043    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 25.5 bits (56), Expect = 0.043
Identities = 10/26 (38%), Positives = 13/26 (50%)

Query: 13 INPSIGDEIDAWAFGVEPDLLADLVL 38
INP+I + +E ADLVL
Sbjct: 411 INPAIAHGLSHEIGSLEVGKRADLVL 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0845  DHBDHDRGNASE  92     2e-24    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 91.7 bits (227), Expect = 2e-24
Identities = 63/252 (25%), Positives = 105/252 (41%), Gaps = 24/252 (9%)

Query: 3 RRVLITGVSSGIGLAQARLFLEKSYQVYGVDQGENPLL-----EGDFHFLQRDLTLDL-- 55
+ ITG + GIG A AR + + VD L D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 56 -EPIFDWCPR-------VDVLCNTAGVLDDYKPLLEQTAQEIQEIFEINYMTPVELTRYY 107
I + R +D+L N AGVL + + +E + F +N +R
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLR-PGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 108 LTQMLENKKGTIINMCSIASSLAGGGGHAYTSSKHALAGFTKQLAIDYAEAGIQIFGIAP 167
M++ + G+I+ + S + + AY SSK A FTK L ++ AE I+ ++P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 168 GAVKTAMT--------AADFEPGGLADWVASETPIKRWIEPEEVAEVSLFLASGKVSAMQ 219
G+ +T M A+ G + + P+K+ +P ++A+ LFL SG+ +
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 220 GQILTIDGGWSL 231
L +DGG +L
Sbjct: 248 MHNLCVDGGATL 259


43  smi_0698  smi_0688  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_0698  -1      14       -2.631693  hypothetical protein
smi_0697  -2      11       -1.023950  hypothetical protein
smi_0696  -1      10       -0.742847  cell wall-associated serine proteinase
smi_0695   1      14        1.466468  *ABC transporter permease
smi_0694   2      15        1.162095  hypothetical protein
smi_0693   1      12        0.745138  ABC transporter ATP-binding protein
smi_0692   0      16        1.420734  fructose-2,6-bisphosphatase
smi_0691   0      12        1.346673  hypothetical protein
smi_0690   0      13        1.583141  plasmid stabilization system toxin protein,
smi_0688  -2      14        2.477834  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0698  TOXICSSTOXIN  25     0.035    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 25.4 bits (55), Expect = 0.035
Identities = 6/25 (24%), Positives = 13/25 (52%)

Query: 45 LLNLDIKVRRLLVKNYSVFYRFDKD 69
+ LD ++R L + + ++ DK
Sbjct: 166 ISTLDFEIRHQLTQIHGLYRSSDKT 190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0694  PF06580  31     0.009    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.0 bits (70), Expect = 0.009
Identities = 22/102 (21%), Positives = 43/102 (42%), Gaps = 15/102 (14%)

Query: 288 ILSLSSV--QELRDDREEIDLLQMTQSLVKDYTLLAKKRELQIDNSLTYQ----QAYLNP 341
+ SLS + LR L ++V Y LA + ++ L ++ A ++
Sbjct: 197 LTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQ---FEDRLQFENQINPAIMDV 253

Query: 342 SVMKLILSNLISNAIKHSVL----GGLVRIG--EREGELFIE 377
V +++ L+ N IKH + GG + + + G + +E
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0693  HTHFIS  86     1e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 1e-21
Identities = 29/104 (27%), Positives = 51/104 (49%), Gaps = 1/104 (0%)

Query: 2 KILIVEDEEMIREGVSDYLTDCGYETIEAADGQEALEKFSSYEVALVLLDIQMPKLNGLE 61
IL+ +D+ IR ++ L+ GY+ ++ ++ + LV+ D+ MP N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLAEIRKT-SQVPVLMLTAFQDEEYKMSAFASLADGYLEKPFSL 104
+L I+K +PVL+++A + A A YL KPF L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDL 108


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0691  PF05272  32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 18/41 (43%), Positives = 23/41 (56%), Gaps = 4/41 (9%)

Query: 28 FEPG-KF-YSII--GESGAGKSTLLSLLAGLDSPVEGSILF 64
EPG KF YS++ G G GKSTL++ L GLD +
Sbjct: 589 MEPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0688  ISCHRISMTASE  51     4e-10    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 51.2 bits (122), Expect = 4e-10
Identities = 27/88 (30%), Positives = 41/88 (46%), Gaps = 1/88 (1%)

Query: 103 KRHYSAFSGTDLDIRLRERRVSTVILTGVLTDICVLHTAIDAYNLGYDIEIVKPAVASIW 162
K YSAF T+L +R+ +I+TG+ I L TA +A+ V AVA
Sbjct: 123 KWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFS 182

Query: 163 PENHQFALDHFKNTLGA-KLVDENLNEI 189
E HQ AL++ + D L+++
Sbjct: 183 LEKHQMALEYAAGRCAFTVMTDSLLDQL 210


44  smi_0637  smi_0631  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_0637   0      16       -0.463957  iron-dependent transcriptional regulator
smi_0636  -2      16        1.772436  hypothetical protein
smi_0635   0      16        1.929102  D-tyrosyl-tRNA(Tyr) deacylase
smi_0634  -2      12        1.923938  GTP pyrophosphokinase, stringent response
smi_0633  -1      14        2.378034  hypothetical protein
smi_0632  -2      10        2.145923  metallo-beta-lactamase superfamily protein
smi_0631  -3      14        1.795543  peptidase M13 superfamily
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0637  TCRTETA  38     5e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.3 bits (89), Expect = 5e-05
Identities = 53/371 (14%), Positives = 131/371 (35%), Gaps = 13/371 (3%)

Query: 2 NRNYKILWLTFLVSNFGDWLRKLALPLLVFEKTGS---PFHMATLYGISFLPWILFSLVG 58
NR ++ T + G L LP L+ + S H L + L + V
Sbjct: 4 NRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVL 63

Query: 59 GILADKFKKVYIISICHLISLVILSLLIISFKNDNVLLLLIYILTFLLSSTEPLVHPAFQ 118
G L+D+F + ++ L+SL ++ L +L + +++
Sbjct: 64 GALSDRFGRRPVL----LVSLAGAAVDYAIMATAPFLWVL--YIGRIVAGITGATGAVAG 117

Query: 119 SLLPQIVTDNQLSKANSGIQLIDNTLNLIGPMISGSVLLLINPAKVLWVNALTFLIASVL 178
+ + I ++ ++ + + GP++ G ++ +P + A + +
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVL-GGLMGGFSPHAPFFAAAALNGLNFLT 176

Query: 179 ILCIKTPENTLVNISKKESLSKTISIGLHYVKNDKIILSGAILFFGTNFAIHIFQANLVY 238
C PE+ + + + ++ + +FF + A V
Sbjct: 177 G-CFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVI 235

Query: 239 YITDVLGYTSFHYGLILSIAGV-GAILGAILAPELIKKFRYGKILSISTMLAGLSTMLLS 297
+ D + + G+ L+ G+ ++ A++ + + + L + M+A + +L
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG-MIADGTGYILL 294

Query: 298 INTNYIYMGVFLGLSNMFGNINAITYFTLRQKVVKKEMLGRVVSITRMISFASIPLGAYL 357
+M + + G I + + V +E G++ ++ + +G L
Sbjct: 295 AFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLL 354

Query: 358 GGILVSHNLTI 368
+ + ++T
Sbjct: 355 FTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0634  ADHESNFAMILY  442    e-160    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 442 bits (1138), Expect = e-160
Identities = 296/309 (95%), Positives = 302/309 (97%)

Query: 1 MKKLGTLFVLFLSVIALVACASGKKDAASGHKLKVVATNSIIADITKNIAGDKIDLHSIV 60
MKKLGTL VLFLS I LVACASGKKD SG KLKVVATNSIIADITKNIAGDKIDLHSIV
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIV 60

Query: 61 PVGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFSKLVENAKKTENKDYFAVSE 120
P+GQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWF+KLVENAKKTENKDYFAVS+
Sbjct: 61 PIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVENAKKTENKDYFAVSD 120

Query: 121 GVEVIYLEGKNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKL 180
GV+VIYLEG+NEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKL
Sbjct: 121 GVDVIYLEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKL 180

Query: 181 DKLDKESKDKFNNIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINTEEEGTPEQIKTLV 240
DKLDKESKDKFN IPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINTEEEGTPEQIKTLV
Sbjct: 181 DKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINTEEEGTPEQIKTLV 240

Query: 241 EKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAEQGKEGDSYYNMMKYNL 300
EKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAEQGKEGDSYY+MMKYNL
Sbjct: 241 EKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAEQGKEGDSYYSMMKYNL 300

Query: 301 DKIAEGLAK 309
DKIAEGLAK
Sbjct: 301 DKIAEGLAK 309


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0632  RTXTOXIND  48     2e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 47.5 bits (113), Expect = 2e-07
Identities = 23/137 (16%), Positives = 53/137 (38%), Gaps = 13/137 (9%)

Query: 198 SQFDQKVYNIARLKYQDLAGLNAFSAAYEEKSKQHQEDLE--QALSDNGK-ARLQLLKKE 254
+Q QK N+ + + + + A YE S+ + L+ +L A+ +L++E
Sbjct: 200 NQKYQKELNLDKKR-AERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQE 258

Query: 255 GQESLDKGQETLDKSETNLQEGKRRLAAAQARIQVQESQLDLLPQAQREQASAQLTQAKQ 314
+ ++ L+ K +L ++ I + + L+ Q + + +L Q
Sbjct: 259 NK---------YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTD 309

Query: 315 ELSKEEDRLKQAEQNLA 331
+ L + E+
Sbjct: 310 NIGLLTLELAKNEERQQ 326



Score = 36.0 bits (83), Expect = 7e-04
Identities = 22/129 (17%), Positives = 44/129 (34%), Gaps = 2/129 (1%)

Query: 231 QHQEDLEQALSDNGKARLQLLKKEG-QESLDKGQETLDKSETNLQEGKRRLAAAQARIQV 289
+ D + S +ARL+ + + S++ + K +
Sbjct: 131 GAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190

Query: 290 QESQLDLLPQAQREQASAQLTQAKQELSKEEDRLKQAEQNLAQEKEKLEKHQRVLDDLAE 349
+ Q Q Q+ Q L + + E R+ + E EK +L+ +L A
Sbjct: 191 IKEQFSTW-QNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAI 249

Query: 350 PKYQVYNRQ 358
K+ V ++
Sbjct: 250 AKHAVLEQE 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0631  PF05272  30     0.006    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.006
Identities = 12/27 (44%), Positives = 18/27 (66%), Gaps = 1/27 (3%)

Query: 32 KGELVIIL-GASGAGKSTVLNLLGGMD 57
K + ++L G G GKST++N L G+D
Sbjct: 594 KFDYSVVLEGTGGIGKSTLINTLVGLD 620


45  smi_0263  smi_0256  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_0263   4      25        3.001735  mannose-specific phosphotransferase system
smi_0262  -1      18        2.692360  hypothetical protein
smi_0261  -2      20        2.388817  hypothetical protein
smi_0260  -1      24        2.292489  transposase, ISSmi5
smi_0259  -1      18        0.855891  aminopeptidase C
smi_0258   0      23        1.886437  16S rRNA pseudouridine(516) synthase
smi_0257   0      25        1.463561  aminopeptidase
smi_0256  -1      19        0.388995  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0263  TCRTETOQM  620    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 620 bits (1600), Expect = 0.0
Identities = 181/667 (27%), Positives = 296/667 (44%), Gaps = 57/667 (8%)

Query: 9 KTRNIGIMAHVDAGKTTTTERILYYTGKIHKIGETHEGASQMDWMEQEQERGITITSAAT 68
K NIG++AHVDAGKTT TE +LY +G I ++G +G ++ D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAQWNNHRVNIIDTPGHVDFTIEVQRSLRVLDGAVTVLDSQSGVEPQTETVWRQATEYGV 128
+ QW N +VNIIDTPGH+DF EV RSL VLDGA+ ++ ++ GV+ QT ++ + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 129 PRIVFANKMDKIGADFLYSVSTLHDRLQANAHPIQLPIGSEDDFRGIIDLIKMKAEIYTN 188
P I F NK+D+ G D + ++L A +IK K E+Y N
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEI------------------VIKQKVELYPN 163

Query: 189 DLGTDILEEDIPAEYLDQAQEYREKLVEAVAETDEELMMKYLEGEEITNEELKAGIRKAT 248
T+ E + + V E +++L+ KY+ G+ + EL+
Sbjct: 164 MCVTNFTESE---------------QWDTVIEGNDDLLEKYMSGKSLEALELEQEESIRF 208

Query: 249 INVEFFPVLCGSAFKNKGVQLMLDAVIDYLPSPLDIPAIKGINPDTDEEETRPASDEEPF 308
N FPV GSA N G+ +++ + + S +
Sbjct: 209 HNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSEL 249

Query: 309 AALAFKIMTDPFVGRLTFFRVYSGVLQSGSYVLNTSKGKRERIGRILQMHANSRQEIDTV 368
FKI RL + R+YSGVL V + K K +I + +ID
Sbjct: 250 CGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKA 308

Query: 369 YSGDIAAAVGLKDTTTGDSLTDEKAKIILESINVPEPVIQLMVEPKSKADQDKMGIALQK 428
YSG+I + L D K E I P P++Q VEP ++ + AL +
Sbjct: 309 YSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREMLLDALLE 367

Query: 429 LAEEDPTFRVETNVETGETVISGMGELHLDVLVDRMRREFKVEANVGAPQVSYRETFRAS 488
+++ DP R + T E ++S +G++ ++V ++ ++ VE + P V Y E R
Sbjct: 368 ISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME--RPL 425

Query: 489 TQARGFFKRQSGGKGQFGDVWIEFTPNEEGKGFEFENAIVGGVVPREFIPAVEKGLVESM 548
+A + + + + +P G G ++E+++ G + + F AV +G+
Sbjct: 426 KKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEGIRYGC 485

Query: 549 ANGVLAGYPMVDVKAKLYDGSYHDVDSSETAFKIAASLALKEAAKSAQPAILEPMMLVTI 608
G L G+ + D K G Y+ S+ F++ A + L++ K A +LEP + I
Sbjct: 486 EQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYLSFKI 544

Query: 609 TVPEENLGDVMGHVTARRGRVDGMEAHGNSQIVRAYVPLAEMFGYATVLRSASQGRGTFM 668
P+E L + + N I+ +P + Y + L + GR +
Sbjct: 545 YAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGRSVCL 604

Query: 669 MVFDHYE 675
Y
Sbjct: 605 TELKGYH 611


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
smi_0259  HTHFIS  73     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.9 bits (179), Expect = 2e-17
Identities = 24/122 (19%), Positives = 51/122 (41%), Gaps = 2/122 (1%)

Query: 2 KLLVAEDQSMLRDAMCQLLTLQPDVESVFQAKNGQEAIQLLEKESVNIAILDVEMPVKTG 61
+LVA+D + +R + Q L+ V N + + ++ + DV MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 LEVLEWIRAENLETKVVVVTTFKRPGYFERAVKAGVDAYVLKERNIADLMQTLHTVLEGR 121
++L I+ + V+V++ +A + G Y+ K ++ +L+ + L
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 122 KE 123
K
Sbjct: 123 KR 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0258  PF06580  40     9e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 40.2 bits (94), Expect = 9e-06
Identities = 66/376 (17%), Positives = 119/376 (31%), Gaps = 67/376 (17%)

Query: 1 MLERLKSIHYMFWASLIFMLFPILPVVTGWLSAWHLLIDILFVVAYLGVLTTKSQHLSWL 60
L L M+F I + G + AY + + WL
Sbjct: 24 TLTGFGFASLYGSPKLHSMIFNIAISLMGLV----------LTHAYRSFI----KRQGWL 69

Query: 61 YWGLMLTYVVGNTAFVAVNYIWFFFFLSNLLSYHFSVRSLKSLHVWTFLLAQVLVVGQLL 120
+ + A V + +WF S F + + L VV
Sbjct: 70 KLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVVTF 129

Query: 121 IFQRIEVESLFYLLVILAFVDLMTFGMVRIRIVEDLKEAQAKQNAQINLLLAENERSRIG 180
++ + F+ + ++ K A Q AQ+ L +++I
Sbjct: 130 MWSLLYFGWHFFK-------------NYKQAEIDQWKMASMAQEAQLMAL-----KAQIN 171

Query: 181 QDLHDSLGHTFAMLSVKTDLALQLFQMEAYPQVEKELKEIHQISKDSMNEVRTIVENLKS 240
+ + + +E + + L + ++ + S+ +
Sbjct: 172 PHF---MFNALNNIRALI--------LEDPTKAREMLTSLSELMRYSLRYSNA-----RQ 215

Query: 241 RTLASELETVKKMLEIAGI----EVQVENQLDKASLTQDVESTAAMVLLELATNIIKHAR 296
+LA EL V L++A I +Q ENQ++ A + V M++ L N IKH
Sbjct: 216 VSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPP---MLVQTLVENGIKHGI 272

Query: 297 AEKA-----YLKLERTDQELVLTVRDDGKGFA----TVKGNELHTVRDRAATFSG---QV 344
A+ LK + + + L V + G G L VR+R G Q+
Sbjct: 273 AQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQI 332

Query: 345 ELVSLKDPTEVRVHLP 360
+L + V +P
Sbjct: 333 KLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0256  PF05272  29     0.025    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.025
Identities = 11/32 (34%), Positives = 16/32 (50%)

Query: 31 CVALIGPNGAGKTTLLDCLLGDKRITSGQVSI 62
V L G G GK+TL++ L+G + I
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


46  smi_0073  smi_0064  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
smi_0073   0      22        5.963823  PTS system, IID component
smi_0072   1      21        5.343425  PTS system, IIC component
smi_0071   0      19        4.075012  phosphotransferase system sugar-specific EIIB
smi_0070  -1      19        4.300556  beta-galactosidase 3
smi_0069  -2      21        4.028032  hypothetical protein
smi_0068  -3      20        3.438264  transposase, IS1167
smi_0067  -3      19        2.679636  adenylosuccinate lyase
smi_0066  -1      25        0.940058  5-(carboxyamino)imidazole ribonucleotide
smi_0065   0      26        1.271134  5-(carboxyamino)imidazole ribonucleotide mutase
smi_0064   2      27        0.033474  phosphoribosylglycinamide synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0073  ARGDEIMINASE  30     0.023    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 29.8 bits (67), Expect = 0.023
Identities = 16/94 (17%), Positives = 35/94 (37%), Gaps = 14/94 (14%)

Query: 146 DGLALGKGVVVAETVEQAVEAAHEMLLDNKFGDSGA--RVVIEEF--------LDGEEFS 195
D L L KG++V E+ + E L + F + + ++ + LD +
Sbjct: 220 DELVLNKGLLVIGISERTEAKSVEKLAISLFKNKTSFDTILAFQIPKNRSYMHLD----T 275

Query: 196 LFAFVNGDKFYIMPTAQDHKRAYDGNKGPNTGGM 229
+F ++ F + + Y P++ +
Sbjct: 276 VFTQIDYSVFTSFTSDDMYFSIYVLTYNPSSSKI 309


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0069  BINARYTOXINA  30     0.012    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 30.4 bits (68), Expect = 0.012
Identities = 11/33 (33%), Positives = 15/33 (45%)

Query: 193 YSLVRRVFADYTGEEVLPELEGKQLKEVLLEPT 225
YS R+ F DY E E E K L+ + +
Sbjct: 93 YSQTRQYFYDYQIESNPREKEYKNLRNAISKNK 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0067  FLGMRINGFLIF  32     0.014    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 32.2 bits (73), Expect = 0.014
Identities = 29/127 (22%), Positives = 41/127 (32%), Gaps = 35/127 (27%)

Query: 1100 ASSTSPTLFYNDANQHVAKMVETRITNTNSPWLAGVQVGDIHAIPVSHGEGKFV--VTAE 1157
S + NDA A VE+RI L+ + G G VTA+
Sbjct: 220 TQSNTSGRDLNDAQLKFANDVESRIQRRIEAILSPIV-----------GNGNVHAQVTAQ 268

Query: 1158 EFAELRDNGQIFSQYVDFDGKPSMDSKYNPNGSVHAIEGITSKNGQIIGKMGHSERYEDG 1217
+DF K + Y+PNG SK ++ SE+ G
Sbjct: 269 ---------------LDFANKEQTEEHYSPNGDA-------SKATLRSRQLNISEQVGAG 306

Query: 1218 LFQNIPG 1224
+PG
Sbjct: 307 YPGGVPG 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0066  RTXTOXINA  30     0.011    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 30.3 bits (68), Expect = 0.011
Identities = 19/81 (23%), Positives = 32/81 (39%), Gaps = 4/81 (4%)

Query: 172 SIVEFYYKNDDLDDPFINDEHVKFLQIADDQQISYLKEETRRINE----LLKAWFAEIGL 227
I + K D L I+ V F + +D + + I + WF +
Sbjct: 861 IIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEKESG 920

Query: 228 KLIDFKLEFGFDKDGKIILAD 248
+ + ++E FDK G+II D
Sbjct: 921 DISNHEIEQIFDKSGRIITPD 941


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
smi_0065  RTXTOXIND  67     3e-14    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 67.2 bits (164), Expect = 3e-14
Identities = 78/465 (16%), Positives = 150/465 (32%), Gaps = 54/465 (11%)

Query: 4 EFLESA-EFYNRRYHNFSSRVIVPMSLLLVFLLGFATFAEKEMSLSTRATVEPSRILANI 62
EFL + E V + LV + + E+ + + S I
Sbjct: 40 EFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEI 99

Query: 63 QSTSN---NRILVNHLEENKLVKKGELLLQYQEGAEGVQAEAYASQLDMLKDQKKQLEYL 119
+ N I+V +E + V+KG++LL+ A G +A+ +Q +L+ + +Q Y
Sbjct: 100 KPIENSIVKEIIV---KEGESVRKGDVLLKLT--ALGAEADTLKTQSSLLQARLEQTRYQ 154

Query: 120 QKSLQEGTDYFPEEDKFGYQATFRDYISQAGSLRASTSQQNETIASQNAAASQTQAEIGN 179
S + PE D + I Q + + +
Sbjct: 155 ILSRSIELNKLPE-------LKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKEL 207

Query: 180 LISQAEAKIRDYQTAKSAIETGASLGNQNLAYSLYQSYKSQGEENPQAKAQAVAQVEAQL 239
+ + A+ + E + + L + S + AV + E +
Sbjct: 208 NLDKKRAERLTVLARINRYENLSRVEKSRL--DDFSSLLHKQ----AIAKHAVLEQENKY 261

Query: 240 SQLESSLATYRVQYAGSGSQQAYASGLSSQL-ESLKSQHLAKVGQELTLLDQKILEVESG 298
+ + L Y+ Q S+ A + + K++ L K+ Q + LE+
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKN 321

Query: 299 KKVQGNLLDKGKITASEDGVLHLNPETSDSSMVAEGALLAQLYPS---LEKEGKAKLTAY 355
++ Q + I A + ++ +V L + P LE
Sbjct: 322 EERQQASV----IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTAL------ 371

Query: 356 LSSKDVARIKVGDSVR-------YTTTHDAKNQLFLDSTITSIDATATKTEKGNFF---- 404
+ +KD+ I VG + YT L + +I+ A + ++
Sbjct: 372 VQNKDIGFINVGQNAIIKVEAFPYTRYGY------LVGKVKNINLDAIEDQRLGLVFNVI 425

Query: 405 -KIEAETNLTSEQAEKLRYGVEGRLQMITGKKSYLRYYLDQFLNK 448
IE T + L G+ ++ TG +S + Y L
Sbjct: 426 ISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEES 470


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0064  ANTHRAXTOXNA  30     0.042    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.1 bits (67), Expect = 0.042
Identities = 34/209 (16%), Positives = 76/209 (36%), Gaps = 26/209 (12%)

Query: 306 NLFFMTLLALPIYTVIIFAFMKPFEKMNRDTMEANAVLSSSIIEDINGIETIKSLTSESQ 365
N F ++ ++V++FA + +E NA+ DI + +E +
Sbjct: 4 NKFIPNKFSIISFSVLLFAIS------SSQAIEVNAMNEHYTESDIKRNHKTEKNKTEKE 57

Query: 366 RYQKIDKEFVDYLKKSFTYSRAESQQKALKKVAHLLLNVGILWMGAVLVMDGKMSLGQLI 425
+++ V + T + + Q LKK+ +L + G + D +
Sbjct: 58 KFKDSINNLVKTEFTNETLDKIQQTQDLLKKIPKDVLEIYSELGGEIYFTDID------L 111

Query: 426 TYNTLLVYFTNPLENIINLQTKLQTAQVANNRLNEVYLVASEFEEKKTV---EDLSLMKG 482
+ L + +N +N + + ++ + E K + +D ++
Sbjct: 112 VEHKELQDLSEEEKNSMNSRGE-------KVPFASRFVFEKKRETPKLIINIKDYAINSE 164

Query: 483 EMTFKQVSYKYGYG--RDVLSDINLTIPQ 509
+ K+V Y+ G G D++S P+
Sbjct: 165 QS--KEVYYEIGKGISLDIISKDKSLDPE 191


47  smi_0038  smi_0031  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
smi_0038   3      19       0.488298  hypothetical protein
smi_0037   0      12       0.768431  transposase, ISSpnII
smi_0036  -2      16       2.170512  transposase, ISSpnII
smi_0034  -2      15       2.033314  hypothetical protein
smi_0033  -2      19       2.913574  hypothetical protein
smi_0032  -2      20       2.455321  transposase, ISSmi1
smi_0031  -2      20       1.983152  transposase, ISSmi4
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0038  PF01540  39     4e-05    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 38.9 bits (90), Expect = 4e-05
Identities = 71/271 (26%), Positives = 110/271 (40%), Gaps = 45/271 (16%)

Query: 44 NYDKALEEAVKALEAAKTNSPEEAKAKAD---TIAALKAASDATRKEIVEGFKKGLTVEA 100
+Y K LE K + A T S +EA + D I+ L AA + + E
Sbjct: 60 DYSKILETLNKEI-AEATKSFKEAGSYGDYPAIISKLSAAVENAKSE------------- 105

Query: 101 AEKLINQANEKAAEEDKAAAAKEKAAQEEFLANYDQALKDA----IAELEKADTKNDPEL 156
++ ++QAN+K A+E+ KE A + L+ Q+ D I +LE + D
Sbjct: 106 -QQKVDQANKKIADEN--LKIKEGAKELLKLSEKIQSFADTIALTITKLEGKKFQIDETF 162

Query: 157 EKQKKETVAALKAASVQTRS-EIVEGFKQGLTAKQAEGLID------EANKKAAEEDKEA 209
+KQ T+ L S + ++ V K+ + E + E EE K+A
Sbjct: 163 KKQLISTIELLNKKSAEVKTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEVKKA 222

Query: 210 AAKE----EAAQEEFLANYDAALKAAIAELEKAETNSPEEAKAKADTIAALKAASDKTRA 265
+KE +A ++ LA + +K EL K A A TI L + +
Sbjct: 223 WSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSFADTIALTITKL-----ERKF 277

Query: 266 EIVEGFKQGLTAKQAEKFIDELNKKAAEEDK 296
+I E FK KQ I+ LNKK+ E
Sbjct: 278 QIDEKFK-----KQLISTIELLNKKSVEVKT 303



Score = 30.9 bits (69), Expect = 0.013
Identities = 51/180 (28%), Positives = 73/180 (40%), Gaps = 35/180 (19%)

Query: 123 EKAAQEEFLANYDQALK--DAIAELEKADTKNDPELEKQKKETVAALK----AASVQTRS 176
+K A++ D ALK +A+AE K + LE KE A K A S
Sbjct: 30 DKLAEKNGKEKADAALKQANALAEELKKNPDYSKILETLNKEIAEATKSFKEAGSYGDYP 89

Query: 177 EIVEGFKQGL-TAKQAEGLIDEANKKAAEEDKEAAAKEEAAQEEFLANYDAALKAAIAEL 235
I+ + AK + +D+ANKK A+E+ +K EL
Sbjct: 90 AIISKLSAAVENAKSEQQKVDQANKKIADEN-------------------LKIKEGAKEL 130

Query: 236 EKAETNSPEEAKAKADTIAALKAASDKTRAEIVEGFKQGLTAKQAEKFIDELNKKAAEED 295
K E+ ++ ADTIA + + +I E FK KQ I+ LNKK+AE
Sbjct: 131 LKLS----EKIQSFADTIALTITKLEGKKFQIDETFK-----KQLISTIELLNKKSAEVK 181


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0037  CHANLCOLICIN  29     0.038    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 29.3 bits (65), Expect = 0.038
Identities = 46/256 (17%), Positives = 91/256 (35%), Gaps = 34/256 (13%)

Query: 41 KAKADAQAVAQHIYDVQDQQL--------SILTPAYNLYKEKDAAYNTL-----SARVKE 87
KAKA+ A+ Q + D+ ++ L S A+ A L + ++
Sbjct: 82 KAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARK 141

Query: 88 AYSNLEKLQQEKIRTADLIGDGTALVRKADELKSARTEAAQREAKFTDDVESYENEVNTA 147
EK QE + I A +LK A E +R A +++ ++ E
Sbjct: 142 EAEAAEKAFQEAEQRRKEIEREKA--ETERQLKLAEAE-EKRLAALSEEAKAVEIAQKKL 198

Query: 148 KQARVDAVKVTAAAEAAFQAALTSGDDTAIAQTRSNLALAKGRQNAAVKAEKEANEKLAK 207
A+ + VK+ + + L+S R A +K +LA+
Sbjct: 199 SAAQSEVVKMDGEIKTL-NSRLSS--------------SIHARD-AEMKTLAGKRNELAQ 242

Query: 208 AEAELQKAVFDQARIEAALQYMDALETGNDYKITQAAYNLDKKIENAKETISTLESEVEV 267
A A+ ++ D+ + + + D L+ ++ T+ K E ++ ++ E+ +
Sbjct: 243 ASAKYKE--LDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINR 300

Query: 268 ARTDRDNALLALENAE 283
D A+
Sbjct: 301 INADITQIQKAISQVS 316


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0036  FLGFLIH  31     4e-04    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 30.9 bits (69), Expect = 4e-04
Identities = 15/35 (42%), Positives = 25/35 (71%)

Query: 22 QAEEKGLERGLERGRAEGIEKGREEGIEEGLKVGL 56
QA E+G + G+ GR +G ++G +EG+ +GL+ GL
Sbjct: 50 QAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGL 84



Score = 28.6 bits (63), Expect = 0.002
Identities = 15/44 (34%), Positives = 26/44 (59%)

Query: 9 EEQALLAQDYALEQAEEKGLERGLERGRAEGIEKGREEGIEEGL 52
E+Q Q A EQ + G+ G ++G +G ++G +G+E+GL
Sbjct: 41 EQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGL 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
smi_0034  FLGFLIH  40     5e-06    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 40.2 bits (93), Expect = 5e-06
Identities = 28/83 (33%), Positives = 45/83 (54%), Gaps = 8/83 (9%)

Query: 215 LDYKSWSEED----RKMFSQLRMREEQALLAQDYALETARAE----GIEQGLERGLERGR 266
L +K+W+ +D + F + EE + + +LE A+ EQG + G+ GR
Sbjct: 5 LPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGIAEGR 64

Query: 267 AEGIEKGREEGLEQGLERGKAEG 289
+G ++G +EGL QGLE+G AE
Sbjct: 65 QQGHKQGYQEGLAQGLEQGLAEA 87



Score = 34.0 bits (77), Expect = 5e-04
Identities = 14/40 (35%), Positives = 25/40 (62%)

Query: 249 ARAEGIEQGLERGLERGRAEGIEKGREEGLEQGLERGKAE 288
A +G + G+ G ++G +G ++G +GLEQGL K++
Sbjct: 51 AHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQ 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
smi_0031  FLGMOTORFLIM  28     0.034    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 28.3 bits (63), Expect = 0.034
Identities = 18/79 (22%), Positives = 35/79 (44%), Gaps = 7/79 (8%)

Query: 119 EEPLKLLPLVFVLALIPVRKLKSLFLLGIASGLGFQMIEDI-GYIRTDLPEGFDFT---- 173
EE ++ +P LA+I + LK +L + + F +I+ + G D T
Sbjct: 91 EEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSITFSIIDRLFGGTGQAAKVQRDLTDIEN 150

Query: 174 --ISRILERIISGIASHWT 190
+ ++ RI++ + WT
Sbjct: 151 SVMEGVIVRILANVRESWT 169
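
Each alignment in this report is preceded by a "Score = ..., Expect = ..." line and an "Identities = ..., Positives = ..." line. The sketch below is one possible way to read those two lines into numbers, for example to filter hits against the E-value threshold shown in the input parameters. It is not part of PredictBias; the function name parse_hsp_summary and the returned field names are hypothetical, and the patterns only assume the line layout seen on this page.

```python
import re

# Patterns for the two summary lines shown above each alignment.
# They assume the layout seen in this report, not every BLAST variant.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(
    r"Identities\s*=\s*(\d+)/(\d+)\s*\(\d+%\),\s*"
    r"Positives\s*=\s*(\d+)/(\d+)\s*\(\d+%\)"
    r"(?:,\s*Gaps\s*=\s*(\d+)/(\d+)\s*\(\d+%\))?"  # Gaps field is optional
)


def parse_hsp_summary(score_line, ident_line):
    """Hypothetical helper: turn one pair of summary lines into a dict."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(ident_line)
    if s is None or i is None:
        raise ValueError("lines do not look like an alignment summary")

    evalue_text = s.group(3)
    # Some BLAST reports print E-values such as "e-100" without a mantissa.
    if evalue_text.startswith("e"):
        evalue_text = "1" + evalue_text

    identities = int(i.group(1))
    length = int(i.group(2))
    return {
        "bits": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "evalue": float(evalue_text),
        "identities": identities,
        "positives": int(i.group(3)),
        "gaps": int(i.group(5)) if i.group(5) else 0,
        "length": length,
        # Recomputed from the counts; not read from the printed percentage.
        "pct_identity": 100.0 * identities / length,
    }


if __name__ == "__main__":
    hit = parse_hsp_summary(
        "Score = 28.3 bits (63), Expect = 0.034",
        "Identities = 18/79 (22%), Positives = 35/79 (44%), Gaps = 7/79 (8%)",
    )
    # Keep the hit only if it passes the 0.05 E-value threshold from section A.
    print(hit["evalue"] <= 0.05, hit)
```

The percentages printed in the report appear to be truncated rather than rounded (18/79 is about 22.8% but is shown as 22%), so the sketch recomputes percent identity from the raw counts instead of trusting the printed value.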



 
Contact Sachin Pundhir for Bugs/Comments.