[Bioclusters] breaking up NCBI databases

Lucas Carey bioclusters@bioinformatics.org
Sun, 29 Jun 2003 23:36:21 -0400


On Thu, May 01, 2003 at 05:29:52PM -0500, Jeremy Mann wrote:
> I am curious if anyone knows of any commercial or open source solution to
> breaking up the NCBI dbs into various sizes. Here, our present solution is

> segments for how many nodes you specify. Now this database is only useful
> when using mpiBLAST. I want to try and use one version of the database for
> all implementations (we also use wwwblast and command line tools). And
 
The fragments created by mpiBLAST are fully compatible with the standard command line tools. If you have fragments nt.00 through nt.99, just run blastall -d nt and it will read each database segment. It SHOULD work with triple-digit fragments as well, provided you use the blastall built from the patched NCBI toolbox. Beware that the NCBI toolbox build does not correctly check dependencies, so you have to run 'make clean' after patching the toolbox for mpiBLAST, or else your binaries won't be rebuilt. Alternatively, you can remove just the binaries manually to avoid rebuilding everything.
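
If you want to check which case you are in before pointing blastall at the database, something like the following rough Python sketch may help. It only inspects file names; it assumes fragment files look like <db>.<digits>.<ext> (for example nt.00.nsq), which may not match your mpiformatdb version exactly. The function names and the regular expression are my own for illustration, not part of mpiBLAST or the NCBI toolbox.

```python
import re
from collections import defaultdict

# Assumption: fragment files are named <db>.<digits>.<ext>, e.g. nt.00.nsq.
# Adjust the pattern if your formatter names its fragments differently.
FRAGMENT_RE = re.compile(r"^(?P<db>.+)\.(?P<idx>\d{2,3})\.(?P<ext>[a-z]+)$")


def summarize_fragments(filenames, db="nt"):
    """Group fragment files by their numeric suffix and collect suffix widths."""
    by_index = defaultdict(set)
    widths = set()
    for name in filenames:
        m = FRAGMENT_RE.match(name)
        if m is None or m.group("db") != db:
            continue  # not a fragment of this database (e.g. an alias file)
        by_index[m.group("idx")].add(m.group("ext"))
        widths.add(len(m.group("idx")))
    return dict(by_index), widths


def needs_patched_blastall(widths):
    """Three-digit suffixes are the case that calls for the patched build."""
    return 3 in widths


if __name__ == "__main__":
    import os
    import sys

    # Usage: python check_fragments.py /path/to/db/dir nt
    db_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    db_name = sys.argv[2] if len(sys.argv) > 2 else "nt"
    fragments, widths = summarize_fragments(os.listdir(db_dir), db_name)
    print(f"{len(fragments)} fragment(s) found for {db_name}")
    if needs_patched_blastall(widths):
        print("three-digit fragments present: use the patched blastall")
```

If it reports only two-digit fragments, the stock blastall -d nt invocation described above should be all you need.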

As for the mpi in mpiformatdb, the first version of the program did indeed use MPI. 

-Lucas