[BiO BB] Looking for researcher, to assist on blast-like invention
theoriste at gmail.com
Tue Feb 12 11:46:03 EST 2008
By the way, nr is ftp-able from NCBI and is a protein-based database if you
On Feb 12, 2008 11:44 AM, DT <theoriste at gmail.com> wrote:
> On Feb 11, 2008 6:56 PM, Theodore H. Smith <delete at elfdata.com> wrote:
> > On 11 Feb 2008, at 22:28, Ryan Golhar wrote:
> > > Why don't you write up a paper describing the algorithm in detail and
> > > submit it to a bioinformatics journal? And, why not make the
> > > executable
> > > available with documentation so that people can download it and try it
> > > out for themselves.
> > >
> > > Do you have any test cases that show it runs faster/better than BLAST?
> > > Describe them and make them available.
> > The first thing I'd need to do is make a good test. I'm not sure what
> > constitutes "a good test", in this case.
> NR ALL VS ALL: This will test speed and, to some degree, performance. The
> nr (non-redundant) database from NCBI is a good place to start testing as
> a template database. Run BLAST with each entry in nr as a query against
> all of nr, do the same with your algorithm, and then compare performance.
> You can generate a ROC plot for BLAST vs your algorithm against a known
> set of homologs and distant homologs, based on a p-value or significance
> level cutoff.
> A real randomization test of sensitivity and specificity would be this:
> take known sequences in nr -- all or some of them -- and scramble them by
> "homologous recombination" -- create chimeras of known sequences by
> different randomization criteria -- by domain (based on domain
> annotation) or by individual sequence based on a known randomization
> function, and then test the sensitivity and specificity of
> BLAST vs your algorithm to detect the originating sequences that created the
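One way the chimera construction could look, as a rough Python sketch. It assumes the sequences are already loaded into a dict keyed by identifier, and it cuts each parent at random positions rather than at annotated domain boundaries. The name `make_chimera` and the provenance tuple layout are hypothetical. The seeded `random.Random` plays the role of the "known randomization function", so every chimera can be regenerated.

```python
import random
from typing import Dict, List, Tuple

Provenance = List[Tuple[str, int, int]]  # (parent id, start, end) per piece


def make_chimera(seqs: Dict[str, str], rng: random.Random,
                 n_parents: int = 2) -> Tuple[str, Provenance]:
    """Join one random fragment from each of n_parents distinct sequences.

    Assumption: every sequence in seqs is non-empty. Cut points are
    uniform random positions, not domain boundaries.
    """
    # Sort the ids so the same seed always picks the same parents.
    parent_ids = rng.sample(sorted(seqs), n_parents)
    pieces = []
    provenance: Provenance = []
    for pid in parent_ids:
        seq = seqs[pid]
        start = rng.randrange(len(seq))               # 0 .. len-1
        end = rng.randrange(start + 1, len(seq) + 1)  # start+1 .. len
        pieces.append(seq[start:end])
        provenance.append((pid, start, end))
    return "".join(pieces), provenance
```

The provenance list records which parents each chimera came from, which is the ground truth needed to score whether a search tool recovers the originating sequences.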
> You will also need to check the performance of your algorithm against
> nucleotide sequences. There are already test cases in BLAST for
> mouse-vs-human; that would be a good test case.
> Deanne Taylor