[Bioclusters] blast db update

Peiran Song bioclusters@bioinformatics.org
Thu, 14 Oct 2004 14:37:03 -0700 (PDT)


This has come up before, but I still need suggestions for what I am 
trying to do. I need to build a local GenBank BLAST database for human, 
mouse and zebrafish that is updated frequently, if not nightly, and 
that can be used with btblastall from the iNquiry software to run BLAST 
jobs in parallel. 

I can think of two ways to get the database, but keeping it updated is 
a problem with both. 

One is to get the nt database and run BLAST with a gi list for the 
species of interest. I would have to get the FASTA data from NCBI so 
that I can format it in a way that btblastall can parallelize. But I 
don't think the NCBI site supports rsync, true? If not, what do people 
do for frequent updates? Another problem with this strategy is that the 
gi list also has to be kept up to date, and I don't have a good idea 
for that either...
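
To make the first option concrete, here is a rough Python sketch of 
pulling the records for a gi list out of an nt FASTA dump. It assumes 
deflines of the form ">gi|NUMBER|db|ACCESSION|"; the function names 
read_fasta and filter_by_gi are made up for illustration and are not 
part of any BLAST package.

```python
import re

# Assumption: deflines start with ">gi|<digits>|".
GI_PATTERN = re.compile(r"^>gi\|(\d+)\|")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new defline closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_gi(lines, wanted_gis):
    """Yield only the records whose gi number is in wanted_gis (a set of ints)."""
    for header, seq in read_fasta(lines):
        match = GI_PATTERN.match(header)
        if match and int(match.group(1)) in wanted_gis:
            yield header, seq
```

The filtered records would still have to be written back out as FASTA 
and formatted for BLAST, and this does nothing about keeping the gi 
list itself current.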

The other choice is to parse the GenBank release to get the initial 
data and use the daily files for updates. But since fmerge is no longer 
supported, is there a good way to do the merge with the NCBI db format? 
(The WU BLAST package has a utility for that.) 
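
The kind of merge I have in mind could be sketched in Python roughly as 
follows. This is not fmerge or the WU BLAST utility, just an 
illustration on (header, sequence) pairs already parsed from FASTA. It 
assumes deflines of the form ">gi|NUMBER|db|ACCESSION.VERSION|" and that 
a record in the daily file should replace any older record with the 
same accession; accession_of and merge_records are hypothetical names.

```python
def accession_of(header):
    """Return the accession without its version from a 'gi|N|db|ACC.V|' defline.

    Assumption: the accession is always the fourth '|'-separated field.
    """
    fields = header.lstrip(">").split("|")
    return fields[3].split(".")[0]


def merge_records(base, updates):
    """Merge update records into base records, keyed by accession.

    A record from `updates` replaces the base record with the same
    accession; records with new accessions are appended at the end.
    """
    merged = {}
    for header, seq in base:
        merged[accession_of(header)] = (header, seq)
    for header, seq in updates:
        merged[accession_of(header)] = (header, seq)
    # dicts keep insertion order, so replaced records stay in place.
    return list(merged.values())
```

The merged records would then need to be written out and re-formatted 
for BLAST after every update, which is the step I would like to avoid 
repeating for the whole database.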

Help me out!

Peiran Song

Zebrafish Information Network