[Bioclusters] Apple/Genentech's new version of Blast

James Cuff bioclusters@bioinformatics.org
Fri, 22 Nov 2002 16:09:52 +0000 (GMT)


On Fri, 22 Nov 2002, chris dagdigian wrote:

> Elia also mentioned that "blastn does not work as well on 2-CPUs" and
> I'd have to echo that thought from my experiences in years past at
> Genetics Institute. I _always_ got better blast throughput by
> constraining the blast search to run only on a single CPU.   Most of the
> people I know who do blast farms are doing the same thing I  believe --
> they constrain blast to run only on a single CPU and compensate for
> throughput by loading up a dual-CPU machine with 2 searches at a time.

In a bit of an <aol>me too</aol>, we run exactly this type of setup here
at the Sanger, and have done for a number of years.

The single-processor DS10Ls and RLX blades all run one job each.  There
is no overcommitment; it really is a case of one singer, one song.

The multiprocessor ES4{0,5}s are all set to run with 4 job slots each.
blastn in particular is heavy on I/O, so the local storage deal that Elia
talked about is particularly important to get decent performance.
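
A minimal sketch of the one-search-per-CPU idea, in Python.  It is
illustrative only and not the farm's actual queueing configuration: the
helper names (blast_command, run_in_slots) and the file paths are made up,
and it assumes the legacy NCBI blastall binary, whose -a option sets the
number of processors a search may use.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def blast_command(query, database, out):
    """Build a blastn command line pinned to a single processor.

    Assumes legacy NCBI blastall; "-a 1" limits the search to one CPU.
    """
    return ["blastall", "-p", "blastn", "-d", database,
            "-i", query, "-o", out, "-a", "1"]


def run_command(cmd):
    """Default runner: execute the command and return its exit status."""
    return subprocess.run(cmd, check=False).returncode


def run_in_slots(commands, slots=None, runner=run_command):
    """Run commands with at most `slots` of them active at any one time.

    `slots` defaults to the number of CPUs, i.e. one search per CPU.
    Results come back in the same order as `commands`.
    """
    slots = slots or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=slots) as pool:
        return list(pool.map(runner, commands))
```

On a four-CPU box, run_in_slots(cmds, slots=4) would keep four
single-threaded searches going at once, which is the "4 job slots"
arrangement described above.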

Again to reinforce what Chris said, we are currently on 1GB/CPU on the
blades and a minimum of 2GB/CPU on the ES4{0,5}s.  This is great, as the
UBC (unified buffer cache) fills up with the target database, and
subsequent jobs all run straight from memory, as long as the database
fits.  The only problem is that the pesky human genome is slightly over
3GB, so you have to put in some smarts to make it fit...
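
One possible form of those "smarts" is to split the target into chunks
that each fit in the cache.  The sketch below is an assumption about how
that could be done, not a description of what is run in production: the
function name pack_into_chunks and the sizes in the usage note are
invented for illustration.  It uses greedy first-fit-decreasing packing.

```python
def pack_into_chunks(seq_sizes, budget_bytes):
    """Group sequences into chunks no larger than budget_bytes.

    seq_sizes maps a sequence name to its size in bytes.  Sequences are
    placed largest first into the first chunk with enough room (first-fit
    decreasing); a new chunk is opened when none has room.
    """
    chunks = []  # each entry is [total_bytes, [names]]
    ordered = sorted(seq_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        if size > budget_bytes:
            raise ValueError("%s (%d bytes) exceeds the budget" % (name, size))
        for chunk in chunks:
            if chunk[0] + size <= budget_bytes:
                chunk[0] += size
                chunk[1].append(name)
                break
        else:
            # No existing chunk has room, so start a new one.
            chunks.append([size, [name]])
    return [names for _, names in chunks]
```

With toy sizes {"a": 600, "b": 500, "c": 400, "d": 300, "e": 200} and a
budget of 1000, this gives two chunks: ["a", "c"] and ["b", "d", "e"].
Bear in mind that BLAST E-values depend on database size, so results from
separately searched chunks need correcting before they are compared.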

I mentioned a while back on this list that we were looking at software
RAID 0 on the RLX blades.  We put this into production, and see a ca. 50%
speed-up in I/O, which in essence is what large-scale genome analysis is
all about...  The Alphas are a different story, as they run AdvFS as a
balanced file set.  That said, since our mega 360xv5.1a upgrade we are
now able to use link aggregation (aka channel bonding), which has helped
our network contention no end.  We also recently ditched ATM APIMs for
gigabit Ethernet uplinks; the overall difference is shocking...

There was some mention of switches a while back too; we are running
Extreme Networks kit.  It is seriously good, as the custom-processor-based
technology really makes it fly.  All ports non-blocking at full speed is
really what you want in this game :-)

Ok brain dump over...

Best regards,

J.

--
Dr. James Cuff
Group Leader - Informatics Systems Group
Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.
Tel: +44 (0) 1223-494880  Fax: +44 (0) 1223-494919