How is it that your results are not right? Do you mean to say that you have two databases: (A) a single FASTA file for nt, roughly 6 GB, formatted with formatdb (no -v parameter), and (B) the same FASTA file formatted with formatdb -v to split it into several pieces, and that using the same query sequence you get different results with A and B?

----- Original Message -----
From: "Sergio Ahumada N" <san@inf.utfsm.cl>
To: <bioclusters@bioinformatics.org>
Sent: Monday, January 27, 2003 9:42 AM
Subject: Re: [Bioclusters] Details on a local blast cluster question

> > ... and also for the record, on our RedHat 7.2 based system (kernel
> > 2.4.2-2smp?), files greater than 2GB have to be piped into formatdb, rather
> > than supplied as an argument
>
> I wrote and sent to this list a Perl script for cutting a large database
> into pieces. I tested it with "nt" (~6 GB), and I can supply the result as an
> argument. It's not great code, but it works :)
>
> Furthermore, Tim Harsch says that splitting a database reduces the cost of
> physical memory. I don't think that is a good idea at all, because the results are
> not right. We are testing a local BLAST cluster (9 Dell PowerEdge machines), and the
> best performance (in both time and disk usage) is obtained when we
> split the input files (FASTA format, obtained via phredPhrap) into pieces the size of
> the physical memory available and send separate jobs to each node in the
> cluster. I hope you find this useful (for beginners, I guess).
>
> Greetings
>
> Andy
>
> ps: I am so sorry for my bad English :/
> --
> Sergio Antonio Ahumada Navea mailto:san@inf.utfsm.cl
> Centro de Bioinformatica - UTFSM
> http://www.biotec.utfsm.cl/
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
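The quoted message refers to a Perl script (not included here) for cutting a large FASTA file into pieces so that each piece fits in a node's physical memory. As a rough illustration of the same idea, here is a minimal Python sketch of my own (the function name `split_fasta`, the `chunk_bytes` parameter, and the output naming are all invented for this example, not taken from the original script): it chunks a multi-FASTA file at record boundaries, so no sequence is ever split across two pieces.

```python
# Hypothetical sketch: split a multi-FASTA file into chunks of roughly
# `chunk_bytes` bytes each, starting a new chunk only at a record boundary
# (a line beginning with '>') so no sequence is cut in half.
# This is NOT the Perl script posted to the list -- just an illustration.

def split_fasta(path, chunk_bytes, out_prefix="chunk"):
    """Write chunk files next to the input; return the list of names created."""
    out_names = []
    out = None
    written = 0

    def open_next():
        nonlocal out, written
        if out:
            out.close()
        name = f"{out_prefix}_{len(out_names):03d}.fasta"
        out_names.append(name)
        out = open(name, "w")
        written = 0

    with open(path) as fh:
        for line in fh:
            # Roll over to a new chunk at a record header once the limit is hit.
            if line.startswith(">") and (out is None or written >= chunk_bytes):
                open_next()
            out.write(line)
            written += len(line)
    if out:
        out.close()
    return out_names
```

Each chunk file can then be submitted as a separate BLAST job, one per cluster node, and the per-chunk result files concatenated afterwards.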