[Bioclusters] blast db update
David Adelson
bioclusters@bioinformatics.org
Mon, 18 Oct 2004 09:47:35 -0500
Peiran,
We use curl. Here is one of the scripts we use for a weekly update to
make blast dbs for iNquiry.
[xblast:/BlastDB] root# more weekly_nuc_curl
cd /Volumes/XBlastRAID/BlastDB/download
curl --disable-epsv -s -S -O -u anonymous:yourusername@yourdomain.edu ftp://ftp.ncbi.nih.gov/blast/db/FASTA/htgs.gz
curl --disable-epsv -s -S -O -u anonymous:xxxxxx@xxxx.edu ftp://ftp.ncbi.nih.gov/blast/db/FASTA/gss.gz
curl --disable-epsv -s -S -O -u anonymous:xxxxxx@xxxx.edu ftp://ftp.ncbi.nih.gov/blast/db/FASTA/sts.gz
gunzip htgs.gz
gunzip gss.gz
gunzip sts.gz
btformatdb -i htgs -p F -s T -o T
btformatdb -i gss -p F -s T -o T
btformatdb -i sts -p F -s T -o T
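The same download, unpack and format steps could also be written as a loop that reports which database failed. This is only a sketch, not the script shown above: update_db and update_all are hypothetical names, the -f flag on gunzip is an addition so that a leftover FASTA file from an earlier run is overwritten, and the curl and btformatdb options are copied unchanged from the script.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; the login is a placeholder.
FTP_BASE="ftp://ftp.ncbi.nih.gov/blast/db/FASTA"
FTP_USER="anonymous:xxxxxx@xxxx.edu"

# update_db NAME: fetch NAME.gz, unpack it, and format it as a nucleotide db.
# Stops at the first step that fails and returns non-zero.
update_db() {
    name=$1
    curl --disable-epsv -s -S -O -u "$FTP_USER" "$FTP_BASE/$name.gz" || return 1
    # -f (an addition) overwrites an unpacked file left over from a previous run
    gunzip -f "$name.gz" || return 1
    btformatdb -i "$name" -p F -s T -o T || return 1
}

# update_all NAME...: try every database, report failures on stderr,
# and return non-zero if any of them failed.
update_all() {
    failed=0
    for name in "$@"; do
        update_db "$name" || { echo "update failed: $name" >&2; failed=1; }
    done
    return $failed
}
```

Run from the download directory, it would be called as: update_all htgs gss sts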
We schedule these scripts to run using cron, so the downloads and db
formatting occur in the wee hours. Note we then have to move the
formatted dbs from /BlastDB/download to our regular blast db location.
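The move step and the cron schedule might look roughly like the following. It is a sketch under assumptions: install_dbs is a hypothetical helper, the formatted files are assumed to be named NAME.<extension> next to the unpacked FASTA file NAME, and the schedule in the comment (Sunday, 02:00) is only an example.

```shell
#!/bin/sh
# Example crontab entry (schedule is an assumption), edited with "crontab -e":
#   0 2 * * 0 /BlastDB/weekly_nuc_curl

# install_dbs SRC_DIR DEST_DIR NAME...
# Move the formatted files for each db from SRC_DIR to DEST_DIR.
# Assumes the formatted files are called NAME.<extension>; the plain FASTA
# file NAME (no extension) is left where it is.
install_dbs() {
    src=$1
    dest=$2
    shift 2
    for name in "$@"; do
        for f in "$src/$name".*; do
            # skip the unexpanded pattern when nothing matches
            [ -e "$f" ] || continue
            mv "$f" "$dest/" || return 1
        done
    done
}
```

For example, with a destination directory of your choosing: install_dbs /Volumes/XBlastRAID/BlastDB/download /path/to/blast/dbs htgs gss sts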
Hope this helps.
Dave Adelson
On Oct 14, 2004, at 4:37 PM, Peiran Song wrote:
> Hi,
>
> This topic has come up before, but I still need suggestions for the
> job I am trying to do. I need to build a local GenBank human, mouse
> and zebrafish blast database that is updated fairly frequently, if not
> nightly, and to be able to run btblastall from the iNquiry software to
> parallelize blast jobs.
>
> I could think of two ways to get the database, but am troubled with the
> updates on both.
>
> One is to get the nt database and run blast with a gi list for the
> species of interest. I would have to get the FASTA data from NCBI in
> order to format it in a way that btblastall can parallelize. But I
> don't think the NCBI site supports rsync, true? Then what are people's
> solutions for frequent updates? Another problem with this strategy is
> that the gi list also has to be updated, and I don't have a good idea
> for that either...
>
> Another choice is to parse the GenBank release to get the initial data,
> and use the daily files for updates. But as fmerge is no longer
> supported, is there a good way to do the merge with the NCBI db format?
> (The WU BLAST package has a utility to achieve that.)
>
> Help me out!
>
> Thanks,
> Peiran Song
>
> Zebrafish Information Network
>