[Bioclusters] Re: Help on BLAST
Chris Dwan (CCGB)
bioclusters@bioinformatics.org
Mon, 26 Aug 2002 08:45:03 -0500 (CDT)
Wim,
If you assume that the two blast target sets are non-overlapping, the
only scores which need to be recalculated are the e- and p- values.
Score and Bit Score are based solely on the alignment, substitution
matrix, and gap costs, plus the K and lambda parameters. Those don't
change with target set size.
e-value = m * n * 2^(-bit_score)
m and n are the number of residues in the query and the target set,
respectively. To recompute an e-value, given n-old (the original target
set size) and n-new (the new TOTAL target set size):
new-e-value = n-new * (old-e-value / n-old)
Adding this line to whatever code you're using is left as an exercise
for the reader.
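
For anyone who wants a starting point, here is a minimal Python sketch of
the two formulas above. The function names and the example numbers are
made up for illustration and do not come from seqsplit, blastunsplit, or
BLAST itself. It also assumes plain residue counts; BLAST reports
e-values using effective (edge-corrected) lengths, so treat the rescaled
numbers as an approximation.

```python
def evalue_from_bit_score(bit_score, m, n):
    """E = m * n * 2^(-bit_score).

    m: residues in the query, n: residues in the target set.
    """
    return m * n * 2.0 ** (-bit_score)


def rescale_evalue(old_evalue, n_old, n_new):
    """Rescale an e-value from a target set of n_old residues to n_new.

    E is linear in n, so only the ratio n_new / n_old matters.
    """
    if n_old <= 0 or n_new <= 0:
        raise ValueError("target set sizes must be positive")
    return n_new * (old_evalue / n_old)


if __name__ == "__main__":
    # Hypothetical numbers: a hit found against one quarter of the
    # database, rescaled to the full database size.
    print(rescale_evalue(1e-5, n_old=1_000_000, n_new=4_000_000))
```

Rescaling this way gives the same answer as recomputing from the bit
score with the new n, because m and the bit score do not change.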
-C
Wim Glassee writes:
> Hi,
>
> I had a fast look at the sources for seqsplit and blastunsplit, and
> there doesn't seem to be any statistics recalculation of any kind in
> there. If you blast smaller pieces of a query sequence against a db, the
> statistics will not be the same as for the original blast, so when
> merging the output files, you won't end up with the same results. In a
> lot of cases even the number of hits and/or hsps will NOT be the same.
>
> Wim
>
>
>
> > -----Original Message-----
> > From: bioclusters-admin@bioinformatics.org [mailto:bioclusters-
> > admin@bioinformatics.org] On Behalf Of Mario Belluardo
> > Sent: maandag 26 augustus 2002 15:04
> > To: bioclusters@bioinformatics.org
> > Subject: [Bioclusters] Re: Help on BLAST
> >
> > Hi Sylvain,
> > I've found and am testing seqsplit (and blastunsplit), which you can
> > download from here:
> >
> > ftp://ftp.cgr.ki.se/pub/prog/MSPcrunch+Blixem/
> >
> > Here is the web documentation:
> > http://www.cgr.ki.se/cgr/groups/sonnhammer/MSPcrunch.html
> >
> > Unfortunately, it seems to work with only a single sequence at a time,
> > which means that you cannot submit multi-sequence queries, but you can
> > modify the source code yourself. I would like to do that, so if you
> > modify it before I do, let me know!
> >
> > Mario
> >
> >
> >
> > > Message: 2
> > > Date: Fri, 23 Aug 2002 14:51:14 -0400
> > > From: Sylvain Foisy <sylvain.foisy@bioneq.qc.ca>
> > > To: bioclusters@bioinformatics.org
> > > Subject: [Bioclusters] Re: Help on BLAST
> > > Reply-To: bioclusters@bioinformatics.org
> > >
> > > Hi
> > >
> > > On Friday, August 23, 2002, at 12:01 PM, bioclusters-
> > > request@bioinformatics.org wrote:
> > >
> > > > I read your posts saying "splitting the query sequence into small
> > > > fragments and BLASTing each of those fragments against the (entire)
> > > > database is super-easy to implement." Could you please tell me how
> > > > to combine the results, or a link to the solution would be very
> > > > helpful?
> > >
> > > Add me to the list of interested parties to that subject. I would
> > > like to know how to write an app that would do these three steps:
> > >
> > > -Split a sequence into pieces of, let's say, 100 nucleotides;
> > > -Send each of them to a node for BLASTing;
> > > -Reassemble the different results into a single report for the
> > >  users.
> > >
> > > Any web links that would help us in our quest?
> > >
> > > Cordially
> > >
> > > Sylvain
> > >
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > Sylvain Foisy, Ph. D.
> > > Directeur-Operations / Project Manager
> > > BioNEQ - Le Reseau quebecois de bioinformatique
> > > Genome-Quebec
> > > Tel.: (514) 878-9911
> > > E-mail: sylvain.foisy@bioneq.qc.ca
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> >
> >
> > --
> >
> > Dr. Mario Belluardo
> > Institute for Cancer Research and Treatment
> > http://www.ircc.it
> > _______________________________________________
> > Bioclusters maillist - Bioclusters@bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/bioclusters
>