[Bioclusters] High Availability Clustering

Osborne, John bioclusters@bioinformatics.org
Tue, 22 Jul 2003 14:58:54 -0400


I'm the unofficial admin for a 20-node (40 CPU) Linux cluster here at the
CDC and I'm looking for some advice.  Our setup here relies upon a *single*
master node, which acts as a gateway to the internal cluster network.  If
something were to happen to the master node, we'd be in serious trouble if
we were aiming for 100% uptime.  So far we haven't been that serious about
100% uptime (although the master node has managed it thus far), but as the
popularity of the cluster grows it is becoming more important.  I am
wondering what the best way is to ensure failover for a master node in a
cluster.  Right now I just write out a master node image to network storage
every night, so if something goes wrong the cluster is effectively down and
it could take hours to get it fixed.

Is it possible to have two master nodes sharing a single virtual IP
address?  How are other people solving this problem?
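
To make the question concrete, here is a rough sketch of the kind of
check I imagine a standby master running.  This is only an illustration
of the idea, not something we run: the function names, the choice of port
22 as the probe target, and the three-failure threshold are all my own
assumptions, and the step of actually claiming the virtual IP is left out.

```python
import socket
import time
from typing import Callable


def primary_alive(host: str, port: int = 22, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the primary master succeeds.

    Port 22 is an assumed probe target; any service the master is
    expected to keep running would do.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_take_over(probe: Callable[[], bool],
                     max_failures: int = 3,
                     delay: float = 1.0) -> bool:
    """Probe the primary up to max_failures times.

    Returns False as soon as one probe succeeds, and True only if every
    probe fails, so a single dropped packet does not trigger a failover.
    """
    for attempt in range(max_failures):
        if probe():
            return False
        if attempt < max_failures - 1:
            time.sleep(delay)  # wait before retrying
    return True
```

The standby would call something like
should_take_over(lambda: primary_alive("master1")) on a timer, where
"master1" is a made-up hostname, and bring up the shared address only
when it returns True.  What this sketch does not handle is the case where
both masters believe they should own the address at the same time, which
is part of what I am asking about.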