Hi guys,

I bumped into the same problem a while ago, and as Jeremy points out,
ncftp(get) seems to be the only workable alternative at the moment.

Wim

> -----Original Message-----
> From: bioclusters-admin@bioinformatics.org
> [mailto:bioclusters-admin@bioinformatics.org] On Behalf Of Joseph Landman
> Sent: Friday 31 January 2003 0:04
> To: bioclusters@bioinformatics.org
> Subject: Re: [Bioclusters] ack. I'm getting bitten by the 2gb filesize
> problem on a linux cluster...
>
> Sanity check:
>
> The bit below shows what happens on an XFS-based RedHat 7.3 machine
> with a kernel upgrade. One of the odd issues I found in RH7.2 days was
> that ext2 and ext3 had troubles with > 2GB files on RH; this was often
> a mixture of several different problems. RH7.3 is a "modern" distro in
> that things work right with mostly up-to-date tools.
>
> Punchline: no problems on this system.
>
> Note: the strace output shows wget using fstat64 and related functions.
> One thing that confused me during my debugging in the past was that if
> any of the libraries or pipelined code snippets had been compiled
> without large file support, it would be the weak link in the chain.
>
> Also note: for both raw performance and sanity reasons, I try to use
> XFS where I can. On large-file sequential reads it is hard to beat.
> Add to that, the fsck times on ext2 (or ext3 without the data=journal
> mount option) are painful even on small systems. I do not (and will
> not) use ReiserFS anymore for anything. I only use ext3 when raw
> performance matters less than keeping the ability to use the next RH
> kernel. Life is made harder by RH not including XFS (yet); this will
> change with the 2.6 kernels.
>
> Synopsis: it looks like you need either a newer wget or a newer glibc
> (or one of the other libraries that RH7.2 presents to wget on RLD).
>
> --- target ---
>
> [landman@squash.scalableinformatics.com:~]
> 5 >wget -v http://127.0.0.1/db/nt.Z
> --17:28:31--  http://127.0.0.1/db/nt.Z
>            => `nt.Z'
> Connecting to 127.0.0.1:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 2,112,977,407 [text/plain]
>
> 100%[=================================================>] 2,112,977,407   34.23M/s    ETA 00:00
>
> 17:29:30 (34.23 MB/s) - `nt.Z' saved [2112977407/2112977407]
>
> [landman@squash.scalableinformatics.com:~]
> 6 >md5sum nt.Z
> b88443e2fb32bd3b593fa39000e7e18a  nt.Z
>
> [landman@squash.scalableinformatics.com:~]
> 7 >wget -O - http://127.0.0.1/db/nt.Z | md5sum -
> --17:33:48--  http://127.0.0.1/db/nt.Z
>            => `-'
> Connecting to 127.0.0.1:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 2,112,977,407 [text/plain]
>
> 100%[=================================================>] 2,112,977,407   21.42M/s    ETA 00:00
>
> 17:35:22 (21.42 MB/s) - `-' saved [2112977407/2112977407]
>
> b88443e2fb32bd3b593fa39000e7e18a  -
>
> [landman@squash.scalableinformatics.com:~]
> 8 >uname -a
> Linux squash.scalableinformatics.com 2.4.20-rc4 #1 Tue Nov 26 23:38:45 EST 2002 i686 unknown
>
> [landman@squash.scalableinformatics.com:~]
> 9 >rpm -qa | grep -i wget
> wget-1.8.2-4.73
>
> --- source ---
>
> [root@squash db]# md5sum nt.Z
> b88443e2fb32bd3b593fa39000e7e18a  nt.Z
>
> --- strace of wget ---
>
> execve("/usr/bin/wget", ["wget", "-v", "http://127.0.0.1/db/nt.Z"], [/* 46 vars */]) = 0
> uname({sys="Linux", node="squash.scalableinformatics.com", ...}) = 0
> brk(0)                                  = 0x80725f0
> open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such file or directory)
> open("/usr/X11R6/lib/i686/mmx/libssl.so.2", O_RDONLY) = -1 ENOENT (No su
>
> [...]
>
> select(4, NULL, [3], [3], {900, 0})     = 1 (out [3], left {900, 0})
> write(3, "GET /db/nt.Z HTTP/1.0\r\nUser-Agen"..., 103) = 103
> write(2, "HTTP request sent, awaiting resp"..., 40HTTP request sent, awaiting response... ) = 40
> select(4, [3], NULL, [3], {900, 0})     = 1 (in [3], left {900, 0})
> read(3, "HTTP/1.1 200 OK\r\nDate: Thu, 30 J"..., 4096) = 4096
> write(2, "200 OK", 6200 OK)             = 6
> write(2, "\n", 1
> )                                       = 1
> write(2, "Length: ", 8Length: )         = 8
> write(2, "2,112,977,407", 132,112,977,407) = 13
> write(2, " [text/plain]\n", 14 [text/plain]
> )                                       = 14
> open("nt.Z.2", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
> fstat64(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
> old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
>
> chris dagdigian wrote:
> > Hi folks,
> >
> > I thought these problems were long past me with modern kernels and
> > filesystems --
> >
> > We as a community have learned to deal with uncompressed sequence
> > databases that are greater than 2gb -- it's pretty simple to gzcat
> > the file and pipe it through formatdb via STDIN to avoid having to
> > uncompress the database file at all.
> >
> > Now, however, I've got a problem: the compressed archive file that
> > someone is trying to download is greater than 2gb in size :)
> >
> > The database in question is:
> >
> > ftp://ftp.ncbi.nlm.nih.gov/blast/db/FormattedDatabases/htgs.tar.gz
> >
> > The file is mirrored via 'wget' and a cron script and has recently
> > started core dumping. An ftp session for this file also seemed to
> > bomb out, but I have not verified this fully.
> >
> > I did the usual things that one does: verified that the wget binary
> > core dumps regardless of which shell one is using (Joe Landman found
> > this issue a while ago...). I also verified that the error occurs
> > when downloading to an NFS-mounted NetApp filesystem as well as to a
> > local ext3-formatted filesystem. The node is running Redhat 7.2 with
> > a 2.4.18-18.7 kernel.
> >
> > The next step was to recompile 'wget' from the source tarball with
> > the usual "-D_ENABLE_64_BIT_OFFSET" and "-D_LARGE_FILES" compiler
> > directives.
> >
> > Still no love. The wget binary still fails once the downloaded file
> > gets a little larger than 2gb in size.
> >
> > Has anyone seen this before? What FTP or HTTP download clients are
> > people using to download large files?
> >
> > -Chris
>
> --
> Joseph Landman, Ph.D
> Scalable Informatics LLC
> email: landman@scalableinformatics.com
> web  : http://scalableinformatics.com
> phone: +1 734 612 4615
>
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
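
PS: in case it helps with the cron mirror, ncftpget will take a plain FTP
URL on the command line, so the fetch should be as simple as the line
below. This is from memory and untested against the NCBI site just now,
so check the man page for the resume/retry options before relying on it:

  ncftpget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FormattedDatabases/htgs.tar.gz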
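
Also, before rebuilding wget by hand it may be worth checking whether the
binary (and the glibc it links against) actually exposes the 64-bit file
interfaces Joe's strace shows (open64/fstat64/lseek64). Something along
these lines should do it -- treat it as a sketch, the exact output will
differ per box:

  nm -D /usr/bin/wget | grep 64   # an LFS-aware build references open64, fstat64, etc.
  getconf LFS_CFLAGS              # prints the flags glibc expects, e.g. -D_FILE_OFFSET_BITS=64

On glibc the canonical large-file defines are _LARGEFILE_SOURCE and
_FILE_OFFSET_BITS=64 rather than the -D_ENABLE_64_BIT_OFFSET /
-D_LARGE_FILES pair in Chris's mail above, so passing the getconf output
in via CPPFLAGS when re-running configure might be worth a try too.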