[Bioclusters] download blast db with rsync in uncompressed format

elijah wright bioclusters@bioinformatics.org
Mon, 1 Dec 2003 11:14:59 -0600 (CST)


> > in fact rsync can't be used at its best performance because the
> > databases are already compressed. Thus the amount of data transmitted
> > to update a local version is very high, and could be much lower if
> > using rsync with uncompressed databases (by using the rsync switch to
> > compress data that is being transmitted). Is there any server from
> > which it would be possible to get such uncompressed files (in fasta or
> > precompressed format)? I couldn't find any with google. Or do you know
> > a better way to lower the amount of transmitted data?

erm, if you're syncing compressed databases against compressed databases,
then rsync's compression switch (-z) should gain you *nothing*.  you just
want rsync to compare the blocks and transfer the ones that aren't the same.
since the files are already compressed, the net amount of data transferred
should be LESS, rather than more...
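
to make the "gain you nothing" point concrete, here is a minimal python
sketch. it uses zlib as a stand-in for both the database compression and
rsync's in-transit compression, and a synthetic random ACGT string as a
stand-in for a fasta file; the function name and the data are made up for
illustration, not taken from any real blast database.

```python
import random
import zlib


def second_pass_savings(raw: bytes) -> float:
    """Fraction of bytes saved by compressing already-compressed data again."""
    once = zlib.compress(raw, 9)    # stand-in for the compressed database file
    twice = zlib.compress(once, 6)  # stand-in for compressing it again in transit
    return 1.0 - len(twice) / len(once)


# synthetic stand-in for sequence data (assumption: not a real database)
rng = random.Random(0)
raw = "".join(rng.choices("ACGT", k=200_000)).encode("ascii")

first_pass_savings = 1.0 - len(zlib.compress(raw, 9)) / len(raw)
extra_savings = second_pass_savings(raw)

print(f"first pass saves  {first_pass_savings:.1%}")
print(f"second pass saves {extra_savings:.1%}")
```

the first pass shrinks the data a lot; the second pass has almost nothing
left to squeeze out, and can even add a few bytes of overhead.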

i am assuming, of course, that you are syncing two compressed versions of
the dataset and not trying to do something truly odd.

[if i remember correctly, rsync copes well with gzip / bzip data because
there are pretty clear block boundaries within which data tends to remain
the same if only some files change...]
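
a rough python sketch of the "compare the blocks and update the ones that
aren't the same" idea follows. it is deliberately simplified: it only
compares blocks at the same fixed offsets, whereas real rsync uses a rolling
checksum so it can also match blocks that have shifted. the block size and
function names are made up for illustration, and the sketch says nothing
about whether a given compressor actually keeps unchanged regions
byte-identical.

```python
import hashlib
from typing import List

BLOCK = 4  # tiny block size, for illustration only


def block_digests(data: bytes, block: int = BLOCK) -> List[bytes]:
    """Digest of each fixed-size block of data (last block may be short)."""
    return [
        hashlib.sha256(data[i:i + block]).digest()
        for i in range(0, len(data), block)
    ]


def blocks_to_send(old: bytes, new: bytes, block: int = BLOCK) -> List[int]:
    """Indices of blocks in `new` that differ from the same-position block in `old`."""
    old_d = block_digests(old, block)
    new_d = block_digests(new, block)
    return [
        i for i, digest in enumerate(new_d)
        if i >= len(old_d) or digest != old_d[i]
    ]


old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBXBCCCCDDDD"  # one byte changed inside the second block
changed = blocks_to_send(old, new)
print(changed)
```

only the block containing the changed byte has to be transferred; the other
three are recognised as identical from their digests alone.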


elijah