[Bioclusters] BLAST/ PBS / Grid Engine

Fri, 17 May 2002 19:10:28 -0400

Ok,

Web-based BLAST services are difficult because unlike high throughput 
BLAST pipelines running on compute farms you can never predict what the 
query or target database is going to be.

First off-  batch queueing systems like PBS, LSF and GridEngine are not 
all that great for backending web applications where speedy response 
time is needed.

The reason is that there is overhead involved in submitting a job, 
having it be accepted by the scheduler and then have it dispatched for 
execution, return the output etc. etc.

In LSF it can take up to 30 seconds to dispatch a job even on an idle 
cluster. You (of course) can turn this time value down via a 
configuation setting but at the expense of adding system overhead.

What it comes down to is that for very short running jobs (say taking 
less than 60 seconds to run to completion) you will pay a penalty for 
using a batch queuing system.

There is a workaround: I don't know about PBS but both LSF and 
GridEngine have a program that will totally skip the scheduler and will 
execute your command right away on the "least busy" system.

In LSF the program is called "lsrun"

In GridEngine the program is called "qrsh"

This is how I've built such systems and seen how other people do it:

(1) For each blast request made via the web do some sort quick triage 
check against the target database. Your goal is a rough prediction of 
the time it will take to complete the search.

(2) If the database is "small" then immediatly send the job to "lsrun" 
or "qrsh" and possibly block on an open pipe to the standard output 
stream so you can send the search result straight back to the browser.

(3) If the database is "large" and you know the search is going to run 
for minutes anyway then submit it formally to the cluster for execution. 
For these sorts of jobs there is usually a user option to have the 
results delivered via email or some other mechanism.

A question that someone might ask at this point could be: "Why bother 
with LSF or GridEngine at all? I'll just rsh/ssh the command and run it 
directly on a remote node!". The answer to this is that both LSF and 
Grid do far more then just check system load via the process table or 
whatever. I think that LSF monitors something like 15 different metric 
values per system so it can understand the "true" load of the system at 
any given time. This gives you better performance and throughput then 
trying to manually figure out load or doing some round-robin scheme.

There is one other catch though that I remember from a project at 
Research Genetics-- I think that in GridEngine that if there is no 
available machine then your 'qsub' will exit with some sort of error. 
This breaks the easy method of just grabbing the STDOUT and sending it 
to the browser. I may have the details wrong but I do remember that 
there were a few issues involving the use of qrsh on heavily loaded 
cluster systems. I also cannot remember what lsrun will do in that 
situation although I'm sitting right next to a big LSF cluster and 
perhaps can play around a bit.

Both 'lsrun' and 'qrsh' will take a command line argument that lets you 
specify hosts that you would prefer to run your command on. If you want 
to get clever you can write your system such that you always execute a 
search of database X against machines Y,Z  so you can take advantage of 
any cached in-memory databases that could still be resident from the 
previous search. Only works with databases small enough to sit in RAM 
though.

<topic change>

I'm not partial to the current blade servers because many of them (like 
the RLX system I played with 6 months ago) end up using cheezy 4200 RPM 
laptop disk drives as the main OS disk. Cheezy laptop drives are simply 
not fast enough to deal with the large number of IO-bound applications 
that bio people typically run. In particular I'd hate to run BLAST on 
one of them.

I do know that James Cuff over at the Ensembl.org project (operaters of 
a badass genome annotation cluster) has been running a serious "blade 
server bake-off" and he has promised to reveal the results of his tests.

James and the folks over at the Sanger Centre / Welcome Trust Campus 
have a _seriously hardcore_ IT infrastructure and I'm really interested 
to hear what they think of the blade tests that they have been running.

-Chris

Steve Pittard wrote:
> First the question:
> 
> Does someone know of certain combinations of load management software and
> OS (e.g. PBS on Scyld or LSF on RedHAt) which have are particulalrly good
> at helping one manage web based Blast submissions ? 
> 
> Now the context:
> 
> I've been offerring a web based blast service
> to my local user community. Its a small emulation of what one finds
> at the NCBI Blast site. We currently have an NCBI-ish web front end
> for some perl scripts which perform the blast and return the results. All
> in all pretty usable stuff except that demand has driven up the load
> averages on my server (a 2XCPU Dell poweredge w Red HAt 7.2). Several
> searches of "nr" can slow things down quite rapidly.
> 
> So I've begun experimenting with OpenPBS to smoothe the load 
> on the server and keep it running well. So far so good but since
> I don't have a cluster cluster yet, I haven't experimented with passing 
> off jobs to other nodes.
> 
> Knowing that Blast (as distributed by NCBI) 
> is not parallel I think that the best
> I can do for the web based queries is to let PBS assign
> the blast jobs to less busy PBS nodes to avoid the logjam.
> I'm fairly certain that no load sofatware (PBS, Grid Engine,
> LSF) can take Blast (or more generally any  non-parallel app) 
> and spread out its CPU needs amongst the cluster. Is this 
> assessment correct ? 
> 
> I realize that for batch blasting that many people "chop
> up" the database over the nodes, formtdb the chunks, and
> blast the queries against these chunks. Perl scripts
> like disperse.pl also segment the larger Blast into more
> manageable pieces. But this isn't scalable for Web queries
> that might occur several times a minute. So In my situation
> I have the Dbs (e.g. nr, swissprot, plant, etc ) "formatdbed" 
> on a server disk with the ultimate intention of having it 
> on cluster nodes perhaps with NFS over gigabit. 
> 
> RLX technologies sells an LSF based
> "Blast server" which is aimed sqaurely at the "I want to blast 
> thousands of sequences  at once" batch blast market though 
> , again, what I'm doing is not really that since my blast requests
> come in over the web on a frequent basis. But I've been working
> with them a bit on my particular situation. 
> 
> Anyway I have been looking at other "proper" cluster systems 
> and have been wondering which setup would best benefit 
> the type of Blasting that I'm interested in. Strongly 
> related to this question is the type of load management 
> software to use and on what platform. I've been using PBS 
> on Red Hat and so far so good but have heard good things 
> about LSF and Grid Engine.
> 

-- 
Chris Dagdigian, <dag@sonsorol.org>
Independent life science IT & research computing consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
Work: http://BioTeam.net PGP KeyID: 83D4310E  Yahoo IM: craffi