[Bioclusters] NCBI updates and how you do them

Jeremy Mann jeremy at biochem.uthscsa.edu
Tue Apr 26 09:37:18 EDT 2005

I'm curious how the community keeps its NCBI databases up to date. Do you
update every day, or weekly?

Here, we have an rsync script that pulls updates from NCBI and then
"pushes" them to all of our nodes. We recently received a new cluster with
dual gigabit Ethernet and channel bonding. When this script runs, the load
on every node being pushed to climbs until the node is completely
unresponsive. In top, I've seen load averages as high as 9.56 while a node
is rsyncing.
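
A minimal sketch of this kind of push, written in Python, is shown here
for illustration only. It is not the script in use here. The paths, node
names, bandwidth cap and parallelism limit are all hypothetical, and
whether throttling rsync actually lowers the load on these nodes is an
untested assumption.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical values for illustration; none come from the original setup.
SRC_DIR = "/data/ncbi/"    # local mirror already synced from NCBI
DEST_DIR = "/data/ncbi/"   # same path on every node
NODES = ["node01", "node02", "node03"]


def build_push_command(node, src=SRC_DIR, dest=DEST_DIR, bwlimit_kbps=20000):
    """Return the rsync argument list that pushes src to one node."""
    return [
        "rsync",
        "-a",                         # archive mode: recurse, keep times and permissions
        "--delete",                   # drop files on the node that left the mirror
        f"--bwlimit={bwlimit_kbps}",  # cap the transfer rate (kilobytes per second)
        src,
        f"{node}:{dest}",
    ]


def push_to_nodes(nodes, max_parallel=2, run=subprocess.call, **kwargs):
    """Push to every node, at most max_parallel at a time.

    Returns a dict mapping each node name to the exit status of its rsync.
    """
    commands = [build_push_command(node, **kwargs) for node in nodes]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        statuses = list(pool.map(run, commands))
    return dict(zip(nodes, statuses))


if __name__ == "__main__":
    for node, status in push_to_nodes(NODES).items():
        print(node, "ok" if status == 0 else f"rsync exited with {status}")
```

The two knobs to experiment with are bwlimit_kbps (how fast each node
receives data) and max_parallel (how many nodes are pushed to at once).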

Yesterday, I experimented with two Ethernet drivers, tg3 and bcm5700. In
my tests, the bcm5700 driver is slightly faster and the load average drops
to about 2 during the push. I still see stretches where it climbs to 4 for
a few minutes before dropping back to 2.

My questions are:

Is rsync the right way to push to all nodes? If not, what alternatives exist?
Does everybody else see the load increase when pushing to all nodes?

Jeremy Mann
jeremy at biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
Phone: (210) 567-2672

