How is it that your results are not right? Do you mean to say that you have two databases: (A) a single FASTA file for nt, roughly 6 GB, formatted with formatdb (no -v parameter), and (B) the same FASTA file formatted with formatdb -v to split it into several pieces, and that using the same query sequence you get different results with A and B?

----- Original Message -----
From: "Sergio Ahumada N" <san@inf.utfsm.cl>
To: <bioclusters@bioinformatics.org>
Sent: Monday, January 27, 2003 9:42 AM
Subject: Re: [Bioclusters] Details on a local blast cluster question

> > ... and also for the record, on our RedHat 7.2 based system (kernel
> > 2.4.2-2smp?), files greater than 2GB have to be piped into formatdb, rather
> > than supplied as an argument
>
> I wrote and sent to this list a Perl script for cutting a large database
> into pieces. I tested it with "nt" (~6 GB), and I can supply the result as an
> argument. It's not great code, but it works :)
>
> Furthermore, Tim Harsch says that splitting a database reduces the cost of
> physical memory. I don't think that is a good idea at all, because the results are
> not right. We are testing a local BLAST cluster (9 Dell PowerEdge machines), and the
> best performance (in both time and disk usage) is obtained when we
> split the input files (FASTA format, obtained via phredPhrap) into pieces the size of
> the physical memory available and send separate jobs to each node in the
> cluster. I hope you find this useful (for beginners, I guess).
>
> Greetings
>
> Andy
>
> ps: I am so sorry for my bad English :/
> --
> Sergio Antonio Ahumada Navea mailto:san@inf.utfsm.cl
> Centro de Bioinformatica - UTFSM
> http://www.biotec.utfsm.cl/
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
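The quoted message refers to a Perl script (not included here) for cutting a large FASTA file into pieces so that each piece fits in a node's physical memory. As a rough illustration of the same idea, here is a minimal Python sketch of my own (the function name `split_fasta`, the `chunk_bytes` parameter, and the output naming are all invented for this example, not taken from the original script): it chunks a multi-FASTA file at record boundaries, so no sequence is ever split across two pieces.

```python
# Hypothetical sketch: split a multi-FASTA file into chunks of roughly
# `chunk_bytes` bytes each, starting a new chunk only at a record boundary
# (a line beginning with '>') so no sequence is cut in half.
# This is NOT the Perl script posted to the list -- just an illustration.

def split_fasta(path, chunk_bytes, out_prefix="chunk"):
    """Write chunk files next to the input; return the list of names created."""
    out_names = []
    out = None
    written = 0

    def open_next():
        nonlocal out, written
        if out:
            out.close()
        name = f"{out_prefix}_{len(out_names):03d}.fasta"
        out_names.append(name)
        out = open(name, "w")
        written = 0

    with open(path) as fh:
        for line in fh:
            # Roll over to a new chunk at a record header once the limit is hit.
            if line.startswith(">") and (out is None or written >= chunk_bytes):
                open_next()
            out.write(line)
            written += len(line)
    if out:
        out.close()
    return out_names
```

Each chunk file can then be submitted as a separate BLAST job, one per cluster node, and the per-chunk result files concatenated afterwards.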