Sanity check: the transcript below shows what happens on an XFS-based Red Hat 7.3 machine with an upgraded kernel. One of the odd issues I found in the RH7.2 days was that ext2 and ext3 had trouble with > 2GB files on RH. This was often a mixture of several different problems. RH7.3 is a "modern" distro in that things work right with mostly up-to-date tools.

Punchline: no problems on this system.

Note: the strace output shows wget using fstat64 and related functions. One thing that confused me when debugging this in the past: if any of the libraries or pipelined code snippets had been compiled w/o large file support, that piece would be the weak link in the chain.

Also note: for both raw performance reasons and sanity reasons, I try to use XFS where I can. On large file sequential reads it is hard to beat. Add to that, the fsck times on ext2 (or ext3 w/o the data=journal option on mount) are painful on small systems. I do not (and will not) use ReiserFS anymore for anything. I only use ext3 when raw performance matters less than maintaining the ability to use the next RH kernel. Life is made harder by RH not including XFS (yet). This will change with 2.6 kernels.

Synopsis: it looks like you need either a newer wget or a newer glibc (or one of the other libraries that RH7.2 presents to wget on RLD).

--- target ---

[landman@squash.scalableinformatics.com:~] 5 >wget -v http://127.0.0.1/db/nt.Z
--17:28:31-- http://127.0.0.1/db/nt.Z
           => `nt.Z'
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,112,977,407 [text/plain]

100%[=================================================>] 2,112,977,407 34.23M/s ETA 00:00

17:29:30 (34.23 MB/s) - `nt.Z' saved [2112977407/2112977407]

[landman@squash.scalableinformatics.com:~] 6 >md5sum nt.Z
b88443e2fb32bd3b593fa39000e7e18a nt.Z

[landman@squash.scalableinformatics.com:~] 7 >wget -O - http://127.0.0.1/db/nt.Z | md5sum -
--17:33:48-- http://127.0.0.1/db/nt.Z
           => `-'
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,112,977,407 [text/plain]

100%[=================================================>] 2,112,977,407 21.42M/s ETA 00:00

17:35:22 (21.42 MB/s) - `-' saved [2112977407/2112977407]

b88443e2fb32bd3b593fa39000e7e18a -

[landman@squash.scalableinformatics.com:~] 8 >uname -a
Linux squash.scalableinformatics.com 2.4.20-rc4 #1 Tue Nov 26 23:38:45 EST 2002 i686 unknown

[landman@squash.scalableinformatics.com:~] 9 >rpm -qa | grep -i wget
wget-1.8.2-4.73

--- source ---

[root@squash db]# md5sum nt.Z
b88443e2fb32bd3b593fa39000e7e18a nt.Z

--- strace of wget ---

execve("/usr/bin/wget", ["wget", "-v", "http://127.0.0.1/db/nt.Z"], [/* 46 vars */]) = 0
uname({sys="Linux", node="squash.scalableinformatics.com", ...}) = 0
brk(0) = 0x80725f0
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/X11R6/lib/i686/mmx/libssl.so.2", O_RDONLY) = -1 ENOENT (No su
[...]
select(4, NULL, [3], [3], {900, 0}) = 1 (out [3], left {900, 0})
write(3, "GET /db/nt.Z HTTP/1.0\r\nUser-Agen"..., 103) = 103
write(2, "HTTP request sent, awaiting resp"..., 40HTTP request sent, awaiting response... ) = 40
select(4, [3], NULL, [3], {900, 0}) = 1 (in [3], left {900, 0})
read(3, "HTTP/1.1 200 OK\r\nDate: Thu, 30 J"..., 4096) = 4096
write(2, "200 OK", 6200 OK) = 6
write(2, "\n", 1 ) = 1
write(2, "Length: ", 8Length: ) = 8
write(2, "2,112,977,407", 132,112,977,407) = 13
write(2, " [text/plain]\n", 14 [text/plain] ) = 14
open("nt.Z.2", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
fstat64(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000

chris dagdigian wrote:

> Hi folks,
>
> I thought these problems were long past me with modern kernels and
> filesystems --
>
> We as a community have learned to deal with uncompressed sequence
> databases that are greater than 2gb -- its pretty simple to gzcat the
> file and pipe it through formatdb via STDIN to avoid having to
> uncompress the database file at all.
>
> Now however I've got a problem that the compressed archive file that
> someone is trying to download is greater than 2gb in size :)
>
> The database in question is:
>
> ftp://ftp.ncbi.nlm.nih.gov/blast/db/FormattedDatabases/htgs.tar.gz
>
> The file is mirrored via 'wget' and a cron script and has recently
> started core dumping. A ftp session for this file also seemed to bomb
> out but I have not verified this fully.
>
> I did the usual things that one does; verified that the wget binary core
> dumps regardless of what shell one is using (Joe Landman found this
> issue a while ago...). I also verified that the error occurs when
> downloading to a NFS mounted NetApp filesystem as well as a local ext3
> formatted filesystem. The node is running Redhat 7.2 with a 2.4.18-18.7
> kernel.
>
> Next step was to recompile 'wget' from the source tarball with the usual
> "-D_ENABLE_64_BIT_OFFSET" and "-D_LARGE_FILES" compiler directives.
>
> Still no love. The wget binary still fails once the downloaded file gets
> a little larger than 2gb in size.
>
> Anyone seen this before? What FTP or HTTP download clients are people
> using to download large files?
>
> -Chris

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615
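
P.S. A side note on the arithmetic behind the "2GB" boundary, as a minimal sketch only. It assumes the failures come from a signed 32-bit file offset, which tops out at 2^31 - 1 bytes. The names MAX_OFF32 and needs_large_file_support are made up for illustration; they are not part of wget or glibc.

```python
# Largest offset a signed 32-bit file offset can hold (assumption: this is
# the limit being hit when large file support is missing somewhere).
MAX_OFF32 = 2**31 - 1  # 2,147,483,647 bytes


def needs_large_file_support(size_bytes: int) -> bool:
    """Return True if a file of this size does not fit in a signed 32-bit offset."""
    return size_bytes > MAX_OFF32


# Length that wget reported for nt.Z in the transcript above.
nt_z_length = 2_112_977_407

print(needs_large_file_support(nt_z_length))  # False
print(MAX_OFF32 - nt_z_length)                # 34506240
```

By that measure nt.Z sits roughly 34.5 million bytes under the boundary, so it may not exercise the same code path as a file larger than 2,147,483,647 bytes. Repeating the test with a file past that size would be the more conclusive check.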