<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns:o = "urn:schemas-microsoft-com:office:office"><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD>
<BODY>
<DIV><FONT face=Arial size=2>
<DIV><FONT face=Arial size=2>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>Hi,</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>I have a bioinformatics project that involves finding polymorphisms in
mitochondrial DNA (mtDNA).<SPAN style="mso-spacerun: yes"> </SPAN>The
polymorphisms are typically denoted as "reference base/position/polymorphic
base", as in A750G.<SPAN style="mso-spacerun: yes"> </SPAN>I'd like to add
a software tool to our company website where a visitor could paste in a set of
mitochondrial genomes, and a reference sequence, and get back a list of
polymorphisms.<SPAN style="mso-spacerun: yes"> </SPAN>Something
like:</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>>Seq1</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>A458G, T4899A....</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>>Seq2</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>T678C, G6789C....</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman">etc.<SPAN style="mso-spacerun: yes">
</SPAN></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
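<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>As a rough illustration of the output described above, here is a minimal
Python sketch. It assumes every pasted sequence has already been aligned to the
reference and has the same length, so it reports substitutions only and ignores
insertions and deletions. The function names parse_fasta and call_polymorphisms
and the short test sequences are made up for this example.</FONT></P>
<PRE>
```python
def parse_fasta(text):
    """Parse FASTA text into a dict of {name: sequence}, in input order."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header line as the sequence name.
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def call_polymorphisms(reference, sequence):
    """List substitutions as reference base, 1-based position, new base."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for pos, (ref_base, base) in enumerate(zip(reference, sequence), start=1):
        # Report only unambiguous substitutions; gaps and ambiguity codes
        # (N, -, and so on) are skipped in this sketch.
        if base != ref_base and base in "ACGT" and ref_base in "ACGT":
            calls.append(f"{ref_base}{pos}{base}")
    return calls


# Toy 10-base example (hypothetical data, not real mtDNA).
reference = "ACGTACGTAC"
queries = parse_fasta(">Seq1\nACGTACATAC\n>Seq2\nTCGTACGTAG\n")
for name, seq in queries.items():
    print(">" + name)
    print(", ".join(call_polymorphisms(reference, seq)))
```
</PRE>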
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>We sequence mitochondrial DNA for customers interested in learning about
their ancient ancestry.</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> </FONT></FONT><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT
face="Times New Roman"><FONT size=3>The site will be freely available.<SPAN
class=562320221-08042006> It will be attached to our company site, <A
href="http://www.argusbio.com/">www.argusbio.com</A>, which is still in
development at LunarPages. The author's name and an email link
could be listed on the page.</SPAN></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>A full-length genome is 16,569 bases long.<SPAN
style="mso-spacerun: yes"> </SPAN>Typically two people will have around 30
to 50 differences in their mtDNA, or more (but fewer than 100) if they have very
different ancestry (African vs European, for example).<SPAN
style="mso-spacerun: yes"> </SPAN>These polymorphisms determine the
person’s mitochondrial haplogroup.</FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>It would be very helpful if the program were able to determine which
haplogroup the mtDNA belongs in based on the list of polymorphisms.<SPAN
style="mso-spacerun: yes"> </SPAN>I have tables of diagnostic
polymorphisms used for classifying mtDNA genomes.</FONT></P>
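<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>One possible way to do the haplogroup step, sketched in Python: score
each haplogroup by the fraction of its diagnostic polymorphisms found in the
sample and report the best match. This is a simplification that treats the
table as flat, whereas real haplogroup schemes are hierarchical. The function
name rank_haplogroups is hypothetical, and the table below is a placeholder
with invented names and markers, not real diagnostic data.</FONT></P>
<PRE>
```python
def rank_haplogroups(observed, diagnostic_table):
    """Rank haplogroups by the fraction of their diagnostic markers observed."""
    observed = set(observed)
    scores = []
    for haplogroup, markers in diagnostic_table.items():
        if not markers:
            continue  # nothing to score against
        matched = observed.intersection(markers)
        scores.append((len(matched) / len(markers), haplogroup, sorted(matched)))
    # Highest fraction first; ties are broken alphabetically by name.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return scores


# Placeholder table: these haplogroup names and markers are invented for
# illustration and are NOT real diagnostic polymorphisms.
DIAGNOSTIC_TABLE = {
    "HG_1": {"A10G", "C25T", "G40A"},
    "HG_2": {"A10G", "T55C"},
}

observed = ["A10G", "C25T", "G40A", "T99C"]
best_score, best_group, matched = rank_haplogroups(observed, DIAGNOSTIC_TABLE)[0]
print(best_group, best_score, matched)
```
</PRE>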
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>It would also be very useful to have an option to generate a FASTA
file consisting of just the polymorphic sites.<SPAN
style="mso-spacerun: yes"> </SPAN>So if someone put in 100 full-length
genomes and a reference genome, the output would be FASTA sequences containing
only the positions where at least one test sequence differs from the reference.<SPAN
style="mso-spacerun: yes"> </SPAN>This output would be much easier to
align with CLUSTALW than the full-length sequences, which are typically > 99%
invariant. </FONT></P>
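<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>The polymorphic-site output could be sketched in Python along these
lines. The sketch assumes all sequences are already aligned to the reference
and equal in length, and it also returns the retained positions so the reduced
columns can be mapped back to genome coordinates. The function names and the
toy sequences are hypothetical.</FONT></P>
<PRE>
```python
def variable_positions(reference, sequences):
    """Return 0-based positions where any sequence differs from the reference."""
    positions = []
    for i, ref_base in enumerate(reference):
        if any(seq[i] != ref_base for seq in sequences.values()):
            positions.append(i)
    return positions


def polymorphic_fasta(reference, sequences, ref_name="reference"):
    """Build FASTA text holding only the variable columns of each sequence."""
    for name, seq in sequences.items():
        if len(seq) != len(reference):
            raise ValueError(f"{name} is not aligned to the reference")
    positions = variable_positions(reference, sequences)
    # The reference is written first so the reduced columns stay comparable.
    lines = [">" + ref_name, "".join(reference[i] for i in positions)]
    for name, seq in sequences.items():
        lines.append(">" + name)
        lines.append("".join(seq[i] for i in positions))
    return positions, "\n".join(lines) + "\n"


# Toy 10-base example (hypothetical data, not real mtDNA).
reference = "ACGTACGTAC"
sequences = {"Seq1": "ACGTACATAC", "Seq2": "TCGTACGTAG"}
positions, fasta_text = polymorphic_fasta(reference, sequences)
print([p + 1 for p in positions])  # 1-based positions kept in the output
print(fasta_text)
```
</PRE>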
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman">I am looking for ideas on how best to implement this
web-based tool.<SPAN style="mso-spacerun: yes"> </SPAN></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT size=3><FONT
face="Times New Roman"> <o:p></o:p></FONT></FONT></P>
<P class=MsoNormal style="MARGIN: 0mm 0mm 0pt"><FONT face="Times New Roman"
size=3>Thanks,</FONT></P></FONT></DIV>
<P>David B. Whyte, Ph.D.<BR>Argus Biosciences, LLC<BR>650-954-1055</P>
<P><SPAN class=562320221-08042006></SPAN><A
href="mailto:dwhyte@argusbio.com">d<SPAN
class=562320221-08042006>whyte@argusbio.com</SPAN></A><BR><A
href="http://www.argusbio.com/">www.argusbio.com</A><BR> </P></FONT></DIV>
<DIV> </DIV></BODY></HTML>