[Bioclusters] what are people using to mirror large ftp
repositories with 2gb+ files?
Joe Landman
bioclusters@bioinformatics.org
Wed, 10 Sep 2003 05:01:01 -0400
Chris:
You also need to watch out for tcsh, if that is the shell you are
using. It may need to be rebuilt with large file support.
I am using curl in my db_dlaf (database download and format) utility
(http://scalableinformatics.com/downloads/db_dlaf.pl). I have been
bitten by some odd bugs in wget in the past, and now prefer curl.
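
For illustration only, here is a minimal Python sketch of driving curl for a single large file. It is not how db_dlaf.pl works internally; the helper names curl_command and fetch are hypothetical, and it assumes a curl binary on the PATH that was built with large file support ("Largefile" in the Features line of `curl --version`).

```python
import subprocess
from pathlib import Path


def curl_command(url: str, dest: Path) -> list[str]:
    """Build the argv for a resumable curl download (hypothetical helper)."""
    return [
        "curl",
        "--fail",              # treat server errors as a failed transfer
        "--location",          # follow redirects (relevant for http URLs)
        "--retry", "5",        # retry transient failures a few times
        "--continue-at", "-",  # resume from the size of any partial file
        "--output", str(dest),
        url,
    ]


def fetch(url: str, dest_dir: Path) -> Path:
    """Download url into dest_dir, keeping the remote file name."""
    dest = dest_dir / url.rsplit("/", 1)[-1]
    # check=True raises CalledProcessError if curl exits non-zero.
    subprocess.run(curl_command(url, dest), check=True)
    return dest
```

Calling fetch() with the URL of one of the large files (for example the nt.Z file in the quoted message below) and a directory on the large-file-capable filesystem would then resume an interrupted transfer rather than start over.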
[landman@squash.scalableinformatics.com:~]
2 >./db_dlaf.pl --help
db_dlaf.pl: copyright 2003 Scalable Informatics LLC
web: http://scalableinformatics.com
email: landman@scalableinformatics.com
Usage:
  db_dlaf.pl [--l | --list] [--path=/path] [--db=db1:db2:...] \
             [--url={http|ftp}://host/path] [--tmp=/path] \
             [--formatdb="options"] [--help]

    --l         list of files
    --list      longer list
    --path      where to put the database indices
    --db        list of databases to grab, use : to separate
    --url       http or ftp path to databases
    --tmp       temporary disk space
    --formatdb  formatdb options to use on each db
Whether files over 2 GB work is also very much a function of which
version of glibc you are using (assuming a Linux/BSD-like OS).
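
The negative sizes in the wget output quoted below are what a byte count larger than 2^31 - 1 looks like when it is held in a signed 32-bit integer. A small Python sketch of the arithmetic follows. It assumes the true sizes are below 4 GiB, which the thread does not state, and the function names are made up for illustration.

```python
def unwrap_int32(reported: int) -> int:
    """Recover a byte count that wrapped in a signed 32-bit integer.

    Assumption: the true value is below 2**32 (4 GiB). Anything larger
    cannot be recovered from the wrapped number alone.
    """
    # Python's % yields a non-negative result for a positive modulus.
    return reported % 2**32


def wrap_int32(size: int) -> int:
    """Show what a signed 32-bit counter would report for a given size."""
    wrapped = size % 2**32
    return wrapped - 2**32 if wrapped >= 2**31 else wrapped


if __name__ == "__main__":
    # The two values reported by wget in the quoted message.
    for label, reported in (("Length", -1_668_277_957), ("to go", -1_705_001_669)):
        print(f"{label}: {reported:,} -> {unwrap_int32(reported):,} bytes")
```

Under that assumption the reported Length of -1,668,277,957 corresponds to 2,626,689,339 bytes, a little over the 2 GB boundary. That would be consistent with a counter that was never widened to 64 bits rather than with a corrupted download.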
Joe
On Wed, 2003-09-10 at 16:17, Chris Dagdigian wrote:
> Hi folks,
>
> I've turned a bunch of Seagate 160gb IDE disks into a large software
> RAID5 volume and am trying to mirror the raw fasta data from
> ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/ for use on a personal
> development project.
>
> The 'wget' utility (Redhat 9 on a dual PIII system w/ ext3 filesystem)
> is bombing out on a few of the remote files which even when compressed
> are greater than 2gb in size.
>
> My kernel and ext3 filesystem support large filesizes but 'wget' or my
> shell seem to have issues.
>
> I've recompiled the wget .src.rpm with the usual compiler flags to add
> large file support and wget _seems_ to be working but I don't really
> trust it as it is reporting negative filesizes like this now:
>
> > RETR nt.Z...done
> > Length: -1,668,277,957 [-1,705,001,669 to go]
>
> What are others doing? Would 'curl' be better? Any recommendations would
> be appreciated.
>
> -Chris
--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 612 4615