[Bioclusters] Bioclusters Digest, Vol 10, Issue 9

bioclusters-request at bioinformatics.org bioclusters-request at bioinformatics.org
Tue Aug 16 12:12:49 EDT 2005


Send Bioclusters mailing list submissions to
	bioclusters at bioinformatics.org

To subscribe or unsubscribe via the World Wide Web, visit
	https://bioinformatics.org/mailman/listinfo/bioclusters
or, via email, send a message with subject or body 'help' to
	bioclusters-request at bioinformatics.org

You can reach the person managing the list at
	bioclusters-owner at bioinformatics.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Bioclusters digest..."


Today's Topics:

   1. RE: Sequest on Linux (Brodie, Kent)


----------------------------------------------------------------------

Message: 1
Date: Tue, 16 Aug 2005 10:43:36 -0500
From: "Brodie, Kent" <brodie at mcw.edu>
Subject: RE: [Bioclusters] Sequest on Linux
To: "Clustering,	compute farming & distributed computing in life
	science informatics"	<bioclusters at bioinformatics.org>
Message-ID: <8F78639AC56F4143B267FE5F5A1B92C8C5E0 at guyton.phys.mcw.edu>
Content-Type: text/plain;	charset="us-ascii"

Hi.

As Simon introduced, we're running Sequest on Linux.  The installation
happens to be SuSE 8; that was chosen because it provided the best
PowerPC support at the time.  We'll likely switch the nodes to RedHat
later this year to match our other Linux servers.  For Sequest, the
particular flavor does not matter too much.  The head node of our
Sequest environment is a Windows 2000 Server.  As far as I know, that
was a requirement.  (The JS20/Sequest install here pre-dated my
arrival...)

From what I have seen, there's not really all that much I/O required to
make the Sequest animal work.  Most of the bandwidth needs are going to
be on the network side, I believe.  That's where our JS20 BladeCenter
excels, because of the common network backplane.  The chunks of data
being analyzed are really nothing more than bits of text.

The Sequest head node basically blasts stuff to the remote worker nodes
via RSH, etc.  The raw files (FASTA and so on) used for comparative
analysis are all copied to the worker nodes "ahead of time", and the
data file chunks being analyzed really aren't that large.  In our
environment, the JS20s have two little 40 GB 2.5" internal SCSI drives
on the blades.  The primary drive only has 3 GB used (operating system,
apps, Sequest), and the secondary drive only has 1.6 GB used (raw data
files).  I do not suspect you're going to notice huge Sequest time
differences based on the drives (I could be wrong?).
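
As a rough illustration of the "copied ahead of time" step above (this
is not the actual Sequest mechanism; the node names, the paths, and the
choice of rsync over rsh are all assumptions made for the example), a
small Python sketch that only builds the copy commands, without running
them, might look like this:

```python
from typing import List


def build_stage_commands(nodes: List[str], src_dir: str, dest_dir: str) -> List[List[str]]:
    """Return one rsync-over-rsh command per worker node (nothing is executed)."""
    # A trailing slash on the source tells rsync to copy the directory contents.
    src = src_dir.rstrip("/") + "/"
    commands = []
    for node in nodes:
        # -a: archive mode; -e rsh: use rsh as the remote shell.
        commands.append(["rsync", "-a", "-e", "rsh", src, f"{node}:{dest_dir}"])
    return commands


if __name__ == "__main__":
    # Hypothetical node names and paths, for illustration only.
    for cmd in build_stage_commands(["blade01", "blade02"], "/data/fasta", "/data/fasta"):
        print(" ".join(cmd))
```

Each returned list could be handed to subprocess.run() once the node
list and paths match the real environment.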

For backups, we really don't care much, since each worker node is more
or less a clone of the other.   A dead node is easily replicated.   We
do keep weekly backups of the first node "just in case".

For integration, my understanding is that a Sequest-installed series of
systems can co-exist with other job scheduler environments on the same
cluster.  As I mentioned earlier, RSH plays a huge part in keeping
Sequest talking.  The technical communication between the nodes really
isn't that complicated, and my assumption is that something like
LSF/SGE/PBS/etc. could peacefully co-exist.

If you have further scientific-like queries, Simon will tackle those,
and I'll be happy to address any other sysadmin-like questions you may
have.

--Kent C. Brodie, MS
  Department of Physiology
  Human & Molecular Genetics Center
  Medical College of Wisconsin




> -----Original Message-----
> From: bioclusters-bounces+brodie=mcw.edu at bioinformatics.org
> [mailto:bioclusters-bounces+brodie=mcw.edu at bioinformatics.org]
> On Behalf Of Botka, Christopher
> Sent: Monday, August 15, 2005 11:46 PM
> To: bioclusters at bioinformatics.org
> Subject: [Bioclusters] Sequest on Linux
> 
> 
> Is anyone out there running Sequest on Linux for MS analysis?  We are
> in the process of setting up a modest sized cluster to run Sequest
> and would be interested in sharing info and experiences with anyone
> out there who might be doing the same.
> 
> Some issues:
> 
>    1. I/O requirements - what's the minimum thruput needed to run
> Sequest?  We are going to test both SATA and FC drives with multiple
> types of interconnects, as well as local SCSI drives.
>    2. Integration of the Thermo queuing system with other job
> management systems (LSF/SGE etc) - Can Sequest be integrated into a
> general purpose cluster?
>    3. Middle to long term storage requirements and back up strategies.
> 
> Thanks,
> 
> Chris
> 
> botka at joslin.harvard.edu
> 
> 
> 
> _______________________________________________
> Bioclusters maillist  -  Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters


------------------------------


End of Bioclusters Digest, Vol 10, Issue 9
******************************************