[Bioclusters] Request for discussions-How to build a biocluster Part 3 (the OS)

13 May 2002 22:17:48 -0500

Donald Becker <becker@scyld.com> writes:
> Not at all.
> Your selection will depend on the type of cluster you want, and how much
> effort and expertise you will need to build and maintain the cluster.
> 
> You can build a first-generation Beowulf cluster by putting together a
> collection of independent machines running your favorite distribution.
> That will require loading the distribution on each machine, and careful
> control of system and application updates.  It's suitable for one person
> being the combined builder, administrator and user.
> 
> The Scyld distribution includes features such as unified process space,
> single-point/single-binary application installation, cluster directory
> services and zero-administration compute nodes.  These features require
> kernel and library support.

I think we're using the word 'distribution' in slightly different ways.  When
I think of my favored distribution, Debian Linux, I think primarily of the
packaging tools and format (apt-get, dpkg, .deb files) and the philosophical
and engineering structures within which the packages get created and
maintained.  Red Hat, another popular Linux distribution, has analogs which
are quite different.

Beowulf software, though it may require special versions of libraries and
patched kernels, is, from the distribution perspective, just a set of
packages.  As far as I can see, the pieces could be packaged for the different
distributions with little difficulty.  Considered this way, the Beowulf
software seems almost completely orthogonal to the choice of distribution, and
therefore should not dictate which distribution you pick.

Of course, some sites may not care which distribution they run, and may value
a turn-key solution, but that's not my situation.

It does seem that the only reasonable way to structure a cluster of any size
is for the compute nodes to somehow mirror some canonical copy of their
filesystems.  (No one wants to do any manual operation O(N) times.)
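
To make "mirror some canonical copy" a bit more concrete, here is a rough
Python sketch of checking one node's tree against the canonical one.  It is
only an illustration: the function names manifest() and drift() are made up
for this example, it looks at regular file contents only (no permissions,
ownership, or symlinks), and it is not how SystemImager or any Beowulf
package actually does the job.

```python
import hashlib
import os


def manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def drift(canonical, node):
    """Compare two manifests; return sorted (missing, extra, changed) paths."""
    missing = sorted(set(canonical) - set(node))   # on canonical, not on node
    extra = sorted(set(node) - set(canonical))     # on node, not on canonical
    changed = sorted(
        p for p in canonical.keys() & node.keys() if canonical[p] != node[p]
    )
    return missing, extra, changed
```

Run once per node against the same canonical manifest, this turns "is node K
still a faithful copy?" into a scripted check rather than a manual one.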

The two obvious ways to accomplish this mirroring would be to use nfsroot and
run the compute nodes diskless (or OS-diskless, anyway), or otherwise to use
something like SystemImager to clone the systems at boot time.  I can see
advantages to both, but currently I'm partial to the nfsroot alternative.  It
just seems simpler, and these days I worship simplicity.  :-)
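
For reference, the nfsroot approach mostly comes down to the kernel arguments
each compute node boots with.  The sketch below assembles them using the
standard root=/dev/nfs, nfsroot= and ip= kernel parameters.  The helper name
nfsroot_cmdline(), the server address 10.0.0.1 and the export path
/export/node-root are all invented for illustration, and a real setup would
also need a kernel built with NFS-root support plus DHCP and NFS export
configuration on the server.

```python
def nfsroot_cmdline(server_ip, export_dir, nfs_options=("ro",)):
    """Build kernel boot arguments for a node that mounts its root over NFS.

    Follows the kernel's nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
    syntax and asks the kernel to configure its network interface via DHCP.
    """
    nfsroot = f"{server_ip}:{export_dir}"
    if nfs_options:
        nfsroot += "," + ",".join(nfs_options)
    return f"root=/dev/nfs nfsroot={nfsroot} ip=dhcp"


# Hypothetical server address and export path, for illustration only.
print(nfsroot_cmdline("10.0.0.1", "/export/node-root"))
# prints: root=/dev/nfs nfsroot=10.0.0.1:/export/node-root,ro ip=dhcp
```

Mounting the shared root read-only, as assumed here, means anything a node
needs to write (such as /var or /tmp) has to live somewhere else, which is one
of the details this sketch leaves out.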

The Bproc stuff looks interesting.  I worry, though, about the amount of brain
surgery being done to the Linux/Unix model (which goes back to that simplicity
thing).

> Doing diskless boots over NFS works for a handful of nodes, but is a
> significant performance and scalability bottleneck.  There are ways to
> mitigate the performance problems with tuning, but you have to measure and
> monitor the system to verify that it's performing as you expect.

I suspect there are a lot of little issues here to bump into.  I hadn't
considered that NFS attribute stuff you pointed out.  (Does Scyld do technical
consulting as well?)
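
On the scalability point, a back-of-envelope estimate shows why a single NFS
server gets uncomfortable as the node count grows.  Every number below is an
assumption picked for illustration (50 MB read per node during boot, 10 MB/s
of usable server bandwidth), not a measurement, and the function name
min_boot_seconds() is made up.  The model ignores caching, attribute
traffic, and contention, so it gives an optimistic lower bound at best.

```python
def min_boot_seconds(nodes, mb_read_per_node, server_mb_per_s):
    """Lower bound on the time for all nodes to finish reading from one server.

    Assumes the server's outgoing bandwidth is the only bottleneck and is
    shared perfectly among the nodes.
    """
    return nodes * mb_read_per_node / server_mb_per_s


# Assumed figures: 50 MB per node, 10 MB/s usable at the server.
for n in (8, 64):
    print(n, "nodes:", min_boot_seconds(n, 50, 10), "seconds")
# prints: 8 nodes: 40.0 seconds
# prints: 64 nodes: 320.0 seconds
```

The total grows linearly with the number of nodes, which matches the remark
that this works for a handful of nodes but needs measuring beyond that.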

As before, since I have yet to build my first Beowulf, take all this with an
extra large grain of salt.

Mike