[Bioclusters] Re: Rsync and NCBI and bio-mirror.net
Aaron Darling
darling at cs.wisc.edu
Thu Feb 2 04:39:14 EST 2006
I had the pleasure of meeting David Lipman (director of NCBI) at an NIH
conference last summer and suggested to him that NCBI run a BitTorrent
server. He had just given a talk about NCBI in which he boasted that
they move terabits of data every day, and have the capacity to move
more. What he didn't seem to understand was that it doesn't matter how
much bandwidth NCBI has if I'm downloading data on the other side of the
planet (or in New Mexico, in this case) and there's a slow link somewhere
in between us. Both NCBI and Los Alamos have very fat network pipes, but
for some reason I can only download FASTA files to LANL at around
300 KB/s. At UW-Madison I get 1.5 MB/s.
Anyway, my point is that it would be great if NCBI would set up a
BitTorrent server themselves, but I wouldn't hold my breath waiting for
their leadership on the issue.
Practically speaking, we'd need a reasonably large number of people
seeding these files for it to work. While I'm not qualified to set up
and administer a BitTorrent server, I'm willing to contribute some
resources by running a client. Anybody else?
-Aaron
Steve O wrote:
> Hi,
> After messing around for a while trying to optimize the NCBI
> downloads, I realized that, independent of the rsync costs, simply
> FTPing the formatted files was the best bet. Even that was a bit
> tricky since the files might change while you're downloading them. I
> no longer have to deal with this problem, but what I wished at the
> time was that some brave soul, perhaps bio-mirror, would offer a
> torrent of, say, Monday's snapshot of a file. Then every subsequent
> day, a torrent would be offered of the diffs of the unpacked file
> vs. the previous unpacked version. Sites that wished to keep up with
> daily changes could apply the patches, while less critical
> applications could just get the weekly distribution. Running a
> BitTorrent client would require server resources, but the actual load
> on your servers should be reduced significantly.
>
> -steve
>
> _______________________________________________
> Bioclusters maillist - Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
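
For anyone who wants to picture the snapshot-plus-daily-diff scheme Steve
describes, here is a rough Python sketch. It is only an illustration under
my own assumptions: the function names make_delta and apply_delta are
hypothetical, the "files" are short in-memory lists of lines, and a real
mirror would more likely ship the output of standard diff/patch tools over
the torrent rather than this ad hoc delta format.

```python
from difflib import SequenceMatcher


def make_delta(old_lines, new_lines):
    """Build a compact delta that turns old_lines into new_lines.

    Hypothetical helper: each entry is (start, end, replacement), meaning
    old_lines[start:end] is replaced by the list `replacement`.
    """
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Only changed regions are stored, so the delta stays small
            # when most of the file is unchanged from day to day.
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the new version from the old version plus a delta."""
    result = []
    pos = 0
    for start, end, replacement in delta:
        result.extend(old_lines[pos:start])  # copy the unchanged run
        result.extend(replacement)           # splice in the changed lines
        pos = end
    result.extend(old_lines[pos:])           # copy the unchanged tail
    return result


# Made-up example data: Monday's full snapshot and two later versions.
monday = [">seq1", "ACGT", ">seq2", "GGCC"]
tuesday = [">seq1", "ACGT", ">seq2", "GGCA", ">seq3", "TTAA"]
wednesday = [">seq1", "ACGT", ">seq3", "TTAA"]

# The mirror publishes Monday's snapshot once, then one delta per day.
tuesday_delta = make_delta(monday, tuesday)
wednesday_delta = make_delta(tuesday, wednesday)

# A site tracking daily changes applies the deltas in order.
local_copy = apply_delta(monday, tuesday_delta)
local_copy = apply_delta(local_copy, wednesday_delta)
```

A site that only needs the weekly distribution would skip the deltas and
fetch the next full snapshot instead. Note that each delta applies only to
the exact version it was computed against, so a site that misses a day has
to catch up in order or fall back to the snapshot.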