[Bioclusters] problem downloading large files from NCBI

andy law (RI) bioclusters@bioinformatics.org
Fri, 3 Oct 2003 09:58:19 +0100


A word of caution - the NCBI FTP server has a bug in it.

The circumstances under which it appears are limited, but...

... if you try to download a file > 2GB ...
... and it fails after transferring more than 2GB, but not the whole file ...
... and you try to issue a REST command to get it to carry on where it left off ...

... then it tells you it has done so (return code 350) but actually starts transmitting at the 2GB point.

Putting that graphically for the hard of thought.
Assume that the file is 2.5GB and that each of the letters a-j represents 0.25GB of information. The file on the server is

    abcdefghij

If we transfer 2.25GB then on the client we have

    abcdefghi

We ask for the rest of the file, starting at the point between 'i' and 'j'.

The server says 'OK' but actually sends us 'ij'

We now have on the client

    abcdefghiij
which is wrong.
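
The same walk-through can be sketched as a small simulation. This is only a toy model: the names SERVER_FILE, LIMIT, send_from, resume and looks_corrupt are made up for illustration, and the assumption that the server treats any restart offset beyond 2GB as exactly 2GB is inferred from the behaviour described above, not from any knowledge of the server's code.

```python
# Toy model of the resume problem; one letter stands for 0.25GB.
SERVER_FILE = "abcdefghij"   # 10 letters = 2.5GB
LIMIT = 8                    # the 2GB point, counted in letters


def send_from(offset, buggy=True):
    """Return what the server transmits after a REST at `offset`.

    Assumption: the buggy server acknowledges the offset but starts
    sending from the 2GB point whenever the offset lies beyond it.
    """
    start = min(offset, LIMIT) if buggy else offset
    return SERVER_FILE[start:]


def resume(client_file, buggy=True):
    """Append the server's data to the partial file held by the client."""
    return client_file + send_from(len(client_file), buggy)


def looks_corrupt(local_size, remote_size):
    """A completed download whose size differs from the server's is suspect."""
    return local_size != remote_size


if __name__ == "__main__":
    partial = "abcdefghi"                  # 2.25GB already on the client
    print(send_from(len(partial)))         # ij
    print(resume(partial))                 # abcdefghiij
    print(resume(partial, buggy=False))    # abcdefghij
    print(looks_corrupt(len(resume(partial)), len(SERVER_FILE)))  # True
```

In this model the resumed file ends up 0.25GB longer than the original, so comparing the final local size with the size reported by the server would catch the problem. Checking against a published checksum, where one is available, is a stronger test.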

I have emailed info@ncbi.nlm.nih.gov. Is there anyone I should mail directly about this?



