[Bioclusters] Cluster ideas/suggestions

Chris Dwan cdwan at bioteam.net
Thu Jan 13 10:29:11 EST 2005

In the workstation labs I mentioned before, a different group was 
responsible for providing the desktop environment, and they were 
(rightfully) sensitive to adding anything to that environment which 
might impinge on the user experience.  After a couple of false starts 
with crashes and some user complaints about the time it took to 
checkpoint and regain control of the machine, we went to the rebooting 
model.  It was simpler all around.

Another advantage was that we could have a single head node managing a 
dynamically sized cluster, i.e., we provided one cluster, and the 
scheduler knew that some of the nodes were partial duty.  This meant 
that our environment was integrated, rather than our having to deal 
with two logical clusters running different DRMs.

When I have used Condor, I found that it was amazing for some apps and 
not so great for others.  One thing its developers have done a 
tremendous job at is managing the social issues of cycle stealing.  
Condor has options to support every social or business constraint on 
getting your hands on those idle CPUs.

I would use Condor if:
* I was the admin of the machines in question (no negotiation needed)
* My jobs were amenable to checkpointing (smallish memory footprint, 
clean and simple compile)
* My I/O needs were small
* My code never, ever freaked out and crashed a machine

BLAST is a poor fit on the second and third points (checkpointing and 
I/O).  Other bioinformatics apps are better.  As with most things 
technical:  "it depends."
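
The checklist above can be written down as a quick screening function.  
This is only a sketch: the JobProfile fields and the condor_concerns 
name are hypothetical and have nothing to do with Condor's own 
configuration.  The BLAST values encode the two "poor fit" points; the 
other two are assumed true for the sake of the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobProfile:
    """Hypothetical summary of a workload, one field per checklist item."""
    admin_controls_machines: bool   # no negotiation with machine owners needed
    checkpoint_friendly: bool       # smallish memory footprint, clean compile
    small_io: bool                  # I/O needs are small
    never_crashes_host: bool        # code never takes the machine down


def condor_concerns(job: JobProfile) -> List[str]:
    """Return the checklist items this job fails; empty means a good fit."""
    checks = [
        (job.admin_controls_machines, "machine owners must be negotiated with"),
        (job.checkpoint_friendly, "job is not amenable to checkpointing"),
        (job.small_io, "I/O needs are not small"),
        (job.never_crashes_host, "code may crash the host machine"),
    ]
    return [message for ok, message in checks if not ok]


# BLAST fails items two and three; items one and four are assumed here.
blast = JobProfile(
    admin_controls_machines=True,
    checkpoint_friendly=False,
    small_io=False,
    never_crashes_host=True,
)
print(condor_concerns(blast))
```

An empty list means none of the four objections apply; anything else is 
a list of the things to sort out before harvesting desktop cycles.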

I've had good luck on Linux and OS X machines with the part-time hooks 
available in SGE.  It's not hard to arrange that jobs simply not be 
scheduled on those nodes at certain times.
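
To illustrate the "certain times" logic, here is a minimal sketch of a 
window check that wraps past midnight.  It uses the 8:00 pm to 5:00 am 
window from Nick's original question as an assumption.  It is not SGE 
syntax; in practice the scheduler's own calendar facility would do this 
job, and the function and constant names are made up for the example.

```python
from datetime import time

# Assumed window, taken from the 8:00 pm - 5:00 am proposal in the thread.
CLUSTER_START = time(20, 0)
CLUSTER_END = time(5, 0)


def in_cluster_window(now: time,
                      start: time = CLUSTER_START,
                      end: time = CLUSTER_END) -> bool:
    """Return True if a part-time node may accept jobs at time `now`.

    The window is half-open: `start` is included and `end` is excluded.
    """
    if start <= end:
        # Window sits inside a single day, e.g. 01:00-05:00.
        return start <= now < end
    # Window wraps past midnight, e.g. 20:00-05:00.
    return now >= start or now < end


for probe in (time(12, 0), time(20, 0), time(23, 30), time(4, 59), time(5, 0)):
    print(probe, in_cluster_window(probe))
```

The end of the window is excluded so that nothing new starts at exactly 
5:00 am, when the desktop is supposed to be handed back.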

-Chris Dwan
  The BioTeam

On Jan 13, 2005, at 10:03 AM, Arnie Miles wrote:

> Hi Nick:
> I think what you're looking for is Condor, from the University of
> Wisconsin.  They have software that can harvest Windows machine cycles
> without all the reboots you're describing.  It allows the use of
> Windows computers whenever they're idle, not just overnight, and if
> Condor is using a machine when its owner wants it, it just checkpoints
> the job and gives the machine back to the owner.
> Arnie
> Joe Landman wrote:
> | Hi Nick:
> |
> | Nick D'Angelo wrote:
> |
> |> In speaking to our R and D managers, they posed a good question.
> |>
> |> What about their desktop machines (relatively new Dell PCs)?  Have
> |> them automatically shut down and reboot, say at 8:00 pm, and then
> |> join the cluster until 05:00 am.
> |
> |
> | This is called "Beowulf-at-night" or BAN.
> |
> |>
> |> Then at 5:00 am, initiate another reboot that will present the
> |> Windows login for the 'normal business day'.
> |
> |
> | It's not "hard" to do, though how do you handle jobs which run longer
> | than usual, or machines that get wedged into some odd state?
> |
> |>
> |> This would make great use of the hundreds of computers that are
> |> mostly sitting idle outside the 'normal business day'.
> |
> |
> | You might wish to look at some offerings from Platform Computing.
> | You could tie those in with Cygwin or similar, and have a "unixish"
> | place to run code while retaining the desktop.  You could also look
> | at things like United Devices.  Just beware that there is a bit more
> | of a philosophical buy-in with that approach, and it is a bit more
> | proprietary.
> |
> | Joe
> |
> |>
> |> Thoughts, suggestions always appreciated, thanks.
> |>
> |> Nick
> |>
> |> In case you need to know, I am likely going to install BioBrew
> |> v3.x, which has just been pre-released in beta, I believe.
> |> _______________________________________________
> |> Bioclusters maillist  -  Bioclusters at bioinformatics.org
> |> https://bioinformatics.org/mailman/listinfo/bioclusters
> |
> |
> --
> ==================
> Arnie Miles
> Systems Administrator, Advanced Research Computing (ARC)
> Adjunct Assistant Professor, Computer Science Dept.
> Georgetown University
> 320c St. Mary's
> 3800 Reservoir Road NW,
> Washington, DC  20057
> Office:  202-687-9379
> Fax:  202-687-1505
> http://www.georgetown.edu/users/adm35/  (Personal)
> http://www.clusters.arc.georgetown.edu/ (GUPPI Initiative)
> http://www.arc.georgetown.edu/ (Division)
> ==================
