[Bioclusters] Re: Rsync and NCBI and bio-mirror.net

Don Gilbert gilbertd at bio.indiana.edu
Wed Feb 1 14:52:20 EST 2006

How much demand is there for mirroring the FASTA/ subfolder of
ftp://ftp.ncbi.nlm.nih.gov/blast/db/ ? I'd be happy
to consider adding it to bio-mirror.net. On the other hand,
it is a large chunk of data and would add to a network copy load
that is already stretched by the blast-format tar files.
We now have almost continuous FTP copying from ncbi:/blast/db/
because of the almost daily data turnover, FTP timeouts, and such.

Those who want source data could instead mirror the GenBank
dataset and convert it to FASTA, at a lower transfer cost. For
example, only a few GenBank/WGS subsets are updated each day,
whereas the whole 18 GB blast wgs.fasta is updated daily.
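
As a rough illustration of the GenBank -> FASTA conversion mentioned
above, here is a minimal Python sketch. It is not the script used at
bio-mirror.net; the function names (genbank_to_fasta, write_fasta) are
made up for this example, and it assumes ordinary GenBank flat-file
records with LOCUS, DEFINITION, VERSION and ORIGIN sections. Records
without an ORIGIN block (e.g. CONTIG-only entries) are skipped.

```python
import re
from typing import Iterable, Iterator, TextIO, Tuple


def genbank_to_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a GenBank flat file.

    Hypothetical helper: handles only the common flat-file layout.
    """
    name = ""
    definition = []
    seq_parts = []
    in_definition = False
    in_origin = False
    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("//"):
            # End of record: emit it only if it carried sequence data.
            if name and seq_parts:
                header = (name + " " + " ".join(definition)).strip()
                yield header, "".join(seq_parts)
            name, definition, seq_parts = "", [], []
            in_definition = in_origin = False
        elif in_origin:
            # Sequence lines look like "       61 acgtacgtac gtacgt";
            # drop the position numbers and the spaces.
            seq_parts.append(re.sub(r"[^A-Za-z]", "", line))
        elif line.startswith("LOCUS"):
            fields = line.split()
            name = fields[1] if len(fields) > 1 else ""
            in_definition = False
        elif line.startswith("DEFINITION"):
            definition = [line[len("DEFINITION"):].strip()]
            in_definition = True
        elif line.startswith("VERSION"):
            fields = line.split()
            if len(fields) > 1:
                name = fields[1]  # prefer accession.version over LOCUS name
            in_definition = False
        elif line.startswith("ORIGIN"):
            in_origin = True
            in_definition = False
        elif in_definition and line.startswith(" "):
            # Continuation line of a multi-line DEFINITION.
            definition.append(line.strip())
        else:
            in_definition = False


def write_fasta(records: Iterable[Tuple[str, str]], out: TextIO,
                width: int = 70) -> None:
    """Write records as FASTA, wrapping sequence lines at `width` columns."""
    for header, seq in records:
        out.write(">" + header + "\n")
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")
```

With this approach only the GenBank subsets that changed need to be
fetched and re-converted, instead of pulling the whole FASTA file again.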

Rsync is a nice tool, but it has a much higher server-side
CPU cost than FTP. Those of you running into rsync errors at NCBI
would probably have better luck using an FTP mirroring tool.
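
For those who want to see what FTP-based mirroring amounts to, here is
a small Python sketch using the standard ftplib module. It is only an
illustration, not a specific tool recommended in this message: the
function names (parse_mdtm, needs_download, mirror_directory) are
invented here, and it assumes the server answers the SIZE and MDTM
commands and allows anonymous login. A file is re-fetched only when its
size differs or the remote copy is newer.

```python
import os
from datetime import datetime, timezone
from ftplib import FTP, error_perm


def parse_mdtm(response: str) -> float:
    """Convert an MDTM reply like '213 20060201145220' to a UTC Unix timestamp."""
    stamp = response.split()[1][:14]  # ignore any fractional seconds
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def needs_download(local_path: str, remote_size: int, remote_mtime: float) -> bool:
    """True if the local file is missing, differs in size, or is older."""
    if not os.path.exists(local_path):
        return True
    st = os.stat(local_path)
    return st.st_size != remote_size or st.st_mtime < remote_mtime


def mirror_directory(host: str, remote_dir: str, local_dir: str) -> list:
    """Copy changed files from one remote directory; return the names fetched."""
    os.makedirs(local_dir, exist_ok=True)
    fetched = []
    with FTP(host) as ftp:
        ftp.login()  # anonymous login (an assumption about the server)
        ftp.cwd(remote_dir)
        names = ftp.nlst()
        ftp.voidcmd("TYPE I")  # many servers only answer SIZE in binary mode
        for name in names:
            try:
                size = ftp.size(name)
                mtime = parse_mdtm(ftp.sendcmd("MDTM " + name))
            except error_perm:
                continue  # probably a subdirectory; this sketch does not recurse
            local_path = os.path.join(local_dir, name)
            if size is None or not needs_download(local_path, size, mtime):
                continue
            tmp_path = local_path + ".part"
            with open(tmp_path, "wb") as out:
                ftp.retrbinary("RETR " + name, out.write)
            os.utime(tmp_path, (mtime, mtime))  # keep the remote timestamp
            os.replace(tmp_path, local_path)    # swap in only complete files
            fetched.append(name)
    return fetched
```

Usage would be along the lines of
mirror_directory("ftp.ncbi.nlm.nih.gov", "/blast/db", "blastdb"),
run from cron. Writing to a ".part" file first means an FTP timeout
leaves the previous complete copy in place.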

- Don Gilbert
-- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
-- gilbertd at indiana.edu--http://marmot.bio.indiana.edu/
