[Bioclusters] blast and nfs

Duzlevski, Ognen bioclusters@bioinformatics.org
Mon, 21 Apr 2003 14:33:53 -0500


Chris,

thank you for the info. I suspected you would say that I will need local
drives ("first pass" solution). Ahhh, more money down the drain? :)

Thanks!
Ognen

>-----Original Message-----
>From: Chris Dagdigian [mailto:dag@sonsorol.org]
>Sent: Monday, April 21, 2003 2:18 PM
>To: bioclusters@bioinformatics.org
>Subject: Re: [Bioclusters] blast and nfs
>
>
>Duzlevski, Ognen wrote:
>> Hi all,
>>
>> we have a 40 node cluster (2 CPUs each) and a cluster master that has
>> attached storage over fibre, pretty much a standard thingie.
>>
>> All of the nodes get their shared space from the cluster master over
>> NFS. I have a user who has set up an experiment that fragmented a
>> database into 200,000 files, which are then being blasted against the
>> standard NCBI databases that reside on the same shared space on the
>> cluster master and are visible on the nodes (he basically rsh-es into
>> all the nodes in a loop and starts jobs). He could probably go about
>> his business in a better way, but for the sake of optimizing the
>> setup, I am actually glad that testing is being done the way it is.
>>
>> I noticed that the cluster master itself is under heavy load (it is a
>> 2 CPU machine), and most of the load comes from the nfsd threads
>> (kernel-space NFS is used).
>>
>> Are there any usual tricks or setup models utilized in setting up
>> clusters? For example, all of my nodes mount the shared space with
>> rw,async,rsize=8192,wsize=8192 options. How many nfsd threads usually
>> run on a master node? Any advice as to the locations of NCBI databases
>> vs. shared space? How would one go about measuring/observing the
>> bottlenecks?
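
For reference, client mount options like those would typically live in
/etc/fstab on each node, and the number of kernel nfsd threads on the
master can be raised with rpc.nfsd. The hostname, export path, and
thread count below are only illustrative, not taken from this setup:

    # /etc/fstab on a compute node (hostname and paths are made up)
    master:/export/shared  /shared  nfs  rw,async,rsize=8192,wsize=8192  0 0

    # on the master: run more kernel nfsd threads
    # (the count is a guess; watch nfsstat and load while tuning)
    rpc.nfsd 32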
>
>
>Hi Ognen,
>
>There are many people on this list who have similar setups and have
>worked around NFS-related bottlenecks in various ways, depending on the
>complexity of their needs.
>
>One easy way to avoid NFS bottlenecks is to realize that BLAST is
>_always_ going to be performance bound by IO speeds and that, generally,
>your IO access to local disk is going to be far faster than your NFS
>connection. Done right, local IDE drives in a software RAID
>configuration can get you better speeds than a direct GigE connection
>to a NetApp filer or fibrechannel SAN.
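
A local striped scratch volume of that sort can be put together with the
Linux md tools; a rough sketch, where the device names, filesystem, and
mount point are illustrative rather than anything from the original
message:

    # stripe two local IDE disks into one scratch volume (example devices)
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/hdb1 /dev/hdd1
    mke2fs -j /dev/md0          # ext3 on the striped device
    mount /dev/md0 /scratch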
>
>Another way to put this: you will NEVER (well, not without exotic
>storage hardware) be able to build an NFS fileserver that cannot be
>swamped by lots of cheap compute nodes doing long sequential reads
>against network-mounted BLAST databases. You need to engineer around
>the NFS bottleneck that is slowing you down.
>
>All you need to do is have enough local disk in each of your compute
>nodes to hold all (or some) of your BLAST datasets. The idea is that
>you use the NFS-mounted BLAST databases only as a 'staging area' for
>rsync'ing or copying your files to scratch or temp space on your
>compute nodes. Given the cheap cost of 40-80 GB IDE disk drives, this
>is a quick and easy way to get around NFS-related bottlenecks.
>
>Each search can then be done against local disk on each compute node
>rather than all nodes hitting the NFS fileserver and beating it to
>death...
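
A minimal sketch of that staging step, assuming an NFS mount at /shared
and local scratch space at /scratch (the paths and database name are
illustrative):

    # refresh the local copy of the databases before (or at the start of) a run
    mkdir -p /scratch/blastdb
    rsync -av /shared/blastdb/ /scratch/blastdb/

    # then point BLAST at the local copy instead of the NFS mount
    blastall -p blastp -d /scratch/blastdb/nr -i query.fa -o query.out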
>
>This is generally what most BLAST farm operators will do as a "first
>pass" approach. It works very well and is pretty much standard practice
>these days.
>
>The "second pass" approach is more complicated and involves=20
>splitting up=20
>your blast datasets into RAM-sized chunks, distributing them=20
>across the=20
>nodes in your cluster and then multiplexing your query across all the=20
>nodes to get faster throughput times. This is harder to=20
>implement and is=20
>useful only for long queries against big databases as there is=20
>a certain=20
>amount of overhead required to merge your multiplexed query=20
>results back=20
>into one human or machine parsable file.
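
One very rough illustration of the idea, using the NCBI formatdb/blastall
tools. The chunk names, hosts, and the -z effective-database-size value
are purely illustrative, and real setups usually drive this through a
scheduler rather than rsh/rcp:

    # format one RAM-sized chunk of a protein database (chunk split not shown)
    formatdb -i nr.chunk01.fa -p T -n nr.chunk01

    # run the same query against each chunk on its own node; -z pins the
    # effective database size so e-values stay comparable across chunks
    rsh node01 "blastall -p blastp -d /scratch/nr.chunk01 -i query.fa \
                -m 8 -z 1400000000 -o /scratch/out.01"

    # pull the per-chunk tabular results back and merge into one file,
    # sorted by e-value (column 11 of the -m 8 output)
    rcp node01:/scratch/out.01 .
    cat out.* | sort -g -k 11 > query.merged.out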
>
>People only implement the 'second pass' approach when they really need
>to, usually in places where pipelines are constantly repeating the same
>big searches over and over again.
>
>
>My $.02 of course
>
>-Chris
>www.bioteam.net
>
>_______________________________________________
>Bioclusters maillist  -  Bioclusters@bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bioclusters
>