[Bioclusters] Using semi-public PCs for heavy computation jobs

Chris Dwan (CCGB) bioclusters@bioinformatics.org
Wed, 18 Feb 2004 15:32:45 -0600 (CST)

> How do you handle user authentication on each machine or set of machines?

Perceptive question:  That's the really hacky bit.

I have a little database that keeps track of the clusters / resources on
which each user has an account, and their username in that domain.  It is
up to the user to configure their account on each resource so that job
submissions can happen from the host where my metascheduler runs onto the
resources that they want to use.

For example, my accounts might be:
  My systems:             cdwan
  Supercomputing center:  chris_dwan
  Workstation lab:        dwanchristopher

As a user, I have exchanged the appropriate certificates and so forth so
that I can ssh into the lab and the supercomputing center without a password.
I have also convinced those admins to allow one machine in my domain to
submit into each of theirs.  That particular machine is where my
metascheduler runs.
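
A rough sketch of that lookup in Python, not the actual metascheduler
code.  The table layout, the resource keys, the example.org hostnames, and
the function names are all hypothetical; only the three usernames come
from the example above.

```python
# Hypothetical account table: (local user, resource) -> username there.
ACCOUNTS = {
    ("cdwan", "local"): "cdwan",
    ("cdwan", "supercomputing_center"): "chris_dwan",
    ("cdwan", "workstation_lab"): "dwanchristopher",
}

# Hypothetical submit hosts for the remote resources (placeholder names).
SUBMIT_HOSTS = {
    "supercomputing_center": "login.sc.example.org",
    "workstation_lab": "submit.lab.example.org",
}


def remote_username(local_user, resource):
    """Return the user's name on a resource, or None if they have no account."""
    return ACCOUNTS.get((local_user, resource))


def ssh_command(local_user, resource, remote_cmd):
    """Build the argv for running remote_cmd on a resource as the right user."""
    name = remote_username(local_user, resource)
    if name is None:
        raise LookupError(f"{local_user} has no account on {resource}")
    host = SUBMIT_HOSTS[resource]
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so a user who has not set up their keys gets an error, not a hang.
    return ["ssh", "-o", "BatchMode=yes", f"{name}@{host}", remote_cmd]


if __name__ == "__main__":
    print(ssh_command("cdwan", "workstation_lab", "hostname"))
```

The point of the design is that the table only records what the user has
already arranged; it never creates accounts on anybody else's systems.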

In effect, I made my very own namespace, in which users may have accounts
that are tied to specific resources.  The major benefit is that it's no
longer *my* job to do the impossible (convince the supercomputing center
to give accounts to all of my users, and vice versa).  I just do the
minimal legwork to make it a bit easier for users with accounts at both
sites to submit across the administrative boundaries.

> Also, aside from your own queueing system, have you tried running your
> jobs using any of the major ones (PBS, SGE, etc.) with and without globus?

I can submit to PBS, SGE, LSF, and Condor-based queuing systems.  I've
tried Globus, but making it work across all of these other domains seemed
beyond my abilities.  The major stumbling block was that some labs
really didn't want to install additional complex software (like Globus),
since their focus was on providing service to their local users.
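
One way to cover several queuing systems without extra middleware is a
small dispatch table keyed on the scheduler type.  This is an illustrative
Python sketch under my own assumptions, not the code described above: the
table and function names are hypothetical, and real submissions usually
need site-specific options on top of this.

```python
# Standard submit command for each queuing system, plus whether the job
# file is conventionally fed on stdin rather than passed as an argument.
SUBMITTERS = {
    "pbs": ("qsub", False),
    "sge": ("qsub", False),
    "lsf": ("bsub", True),              # typically run as: bsub < job.sh
    "condor": ("condor_submit", False),  # takes a submit description file
}


def build_submission(scheduler, job_file):
    """Return (argv, stdin_path) for submitting job_file to a scheduler.

    stdin_path is None when the job file is passed as an argument.
    """
    try:
        command, via_stdin = SUBMITTERS[scheduler.lower()]
    except KeyError:
        raise ValueError(f"unsupported queuing system: {scheduler}")
    if via_stdin:
        return [command], job_file
    return [command, job_file], None


if __name__ == "__main__":
    for name in ("pbs", "sge", "lsf", "condor"):
        print(name, build_submission(name, "job.sh"))
```

The returned argv can be run remotely over ssh, so nothing beyond the
queuing system itself has to be installed on the other lab's machines.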

> Would love to hear more on what you've done, especially how to convince
> the lab managers that giving up their unused CPU cycles is a good idea :-)

My experience has been that almost every lab support team has at least
one member who is totally into the geeky side of these things.  Once
there's personal trust that neither side is populated by malevolent
imbeciles, the technical aspects become much easier.  Start a monthly
roundtable for IT folks.  Provide pizza.  Pretty soon you'll have more
cycles than you know what to do with.

It's also possible to force the issue by having power users go through
administrative channels and demand that ports be opened and software be
installed.  This makes no friends and raises tensions.  I rarely want that
kind of "help."