[Bioclusters] blast db update

David Adelson bioclusters@bioinformatics.org
Mon, 18 Oct 2004 09:47:35 -0500


We use curl.  Here is one of the scripts we use for a weekly update to 
make blast dbs for iNquiry.

[xblast:/BlastDB] root# more weekly_nuc_curl
cd /Volumes/XBlastRAID/BlastDB/download
curl --disable-epsv -s -S -O -u anonymous:yourusername@yourdomain.edu 
curl --disable-epsv -s -S -O -u anonymous:xxxxxx@xxxx.edu 
curl --disable-epsv -s -S -O -u anonymous:xxxxxx@xxxx.edu 
gunzip htgs.gz
gunzip gss.gz
gunzip sts.gz
btformatdb -i htgs -p F -s T -o T
btformatdb -i gss -p F -s T -o T
btformatdb -i sts -p F -s T -o T
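
The script above runs every step even if an earlier one failed. A minimal sketch of the same gunzip and btformatdb steps with basic error checking might look like the following. The function name format_dbs and the FORMATDB override variable are hypothetical, not part of iNquiry; the btformatdb options are copied unchanged from the script above.

```shell
#!/bin/sh
# Sketch only: decompress and format each database, skipping a database
# as soon as one of its steps fails.
# FORMATDB is a hypothetical override hook; it defaults to btformatdb.
FORMATDB=${FORMATDB:-btformatdb}

format_dbs() {
    # Usage: format_dbs DIR NAME...
    dir=$1
    shift
    failed=0
    for db in "$@"; do
        # Refuse to continue if the download is missing or empty.
        if [ ! -s "$dir/$db.gz" ]; then
            echo "skipping $db: $dir/$db.gz missing or empty" >&2
            failed=1
            continue
        fi
        # -f overwrites an uncompressed copy left over from an earlier run.
        if ! gunzip -f "$dir/$db.gz"; then
            echo "skipping $db: gunzip failed" >&2
            failed=1
            continue
        fi
        # Same options as the original script, run inside DIR.
        if ! (cd "$dir" && "$FORMATDB" -i "$db" -p F -s T -o T); then
            echo "formatting $db failed" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Example call, using the download directory from the script above:
#   format_dbs /Volumes/XBlastRAID/BlastDB/download htgs gss sts
```

The function returns non-zero if any database was skipped or failed, so a cron wrapper can decide whether to go on.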

We schedule these scripts to run using cron, so the downloads and db 
formatting occur in the wee hours.  Note we then have to move the 
formatted dbs from /BlastDB/download to our regular blast db location.
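
As a rough illustration of the cron scheduling and the move step, here is a sketch. The crontab time, the log file, the script location, the destination directory and the function name install_dbs are all assumptions; it also assumes the formatted files are named after the database with an extension (htgs.*, gss.*, sts.*).

```shell
#!/bin/sh
# Hypothetical crontab entry (Sundays at 02:30); script and log paths are assumed:
#   30 2 * * 0 /BlastDB/weekly_nuc_curl >> /var/log/weekly_nuc_curl.log 2>&1

install_dbs() {
    # Usage: install_dbs SRC_DIR DEST_DIR NAME...
    # Moves every file named NAME.<something> from SRC_DIR to DEST_DIR.
    src=$1
    dest=$2
    shift 2
    for db in "$@"; do
        for f in "$src/$db".*; do
            # If the glob matched nothing, the pattern itself is left over.
            [ -e "$f" ] || continue
            mv -f "$f" "$dest/" || return 1
        done
    done
    return 0
}

# Example call; the destination path is a placeholder:
#   install_dbs /BlastDB/download /path/to/regular/blastdb htgs gss sts
```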

Hope this helps.

Dave Adelson

On Oct 14, 2004, at 4:37 PM, Peiran Song wrote:

> Hi,
> This has been a topic before, but I still need suggestions for
> the job I am trying to do. I need to build a local GenBank human, mouse
> and zebrafish blast database that is updated fairly frequently, if not
> nightly, and be able to run btblastall from the iNquiry software to
> run blast jobs in parallel.
> I can think of two ways to get the database, but updates are a problem
> with both.
> One is to get the nt database and run blast with a gi list of the species
> of interest. I will have to get FASTA data from NCBI in order to format it
> in a way that btblastall can run in parallel. But I don't think the
> NCBI site supports rsync, true? Then what are people's solutions for
> frequent updates? Another problem with this strategy is that the gi list
> also has to be updated, and I don't have a good idea for that either...
> Another choice is to parse the GenBank release to get the initial data, and
> use the daily files for updates. But as fmerge is no longer supported, is
> there a good way to do the merge with the NCBI db format? (The WU BLAST
> package has a utility to achieve that.)
> Help me out!
> Thanks,
> Peiran Song
> Zebrafish Information Network
> _______________________________________________
> Bioclusters maillist  -  Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters