You could use GFS on a fiber channel shared storage and have multiple NFS servers serving files to your cluster from it. This would help you out even if you keep everything locally on nodes because you still have to copy everything to the nodes and I assume you do it from your master node/NFS server. Copying 3-4 GB of data to 100 nodes from only one machine over Fast Ethernet/NFS (or ssh) can take forever.

Goran Ceric
System Administrator
Washington University, St. Louis
Department of Genetics, Eddy Lab
goran@genetics.wustl.edu
http://www.genetics.wustl.edu/eddy

-----Original Message-----
From: bioclusters-admin@bioinformatics.org [mailto:bioclusters-admin@bioinformatics.org] On Behalf Of Ivo Grosse
Sent: Monday, May 13, 2002 10:32 AM
To: bioclusters@bioinformatics.org
Subject: Re: [Bioclusters] http://www.sistina.com/products_gfs.htm

Hi Joe,

thanks for your great *general* answer.

Hi Joe and Chris and others,

I try to make my question more *specific*:

0. we often use Blast, and we often blast two large sets against each other, e.g. the human against the mouse genome. In that example, one genome (e.g. mouse) will be the database, and we will chop up the human genome into, say, 101-kb pieces overlapping by 1 kb, and then throw those 30,000 101-kb pieces against the mouse database using SGE. We (in our group) do NOT need or want Mosix.

1. the (mouse) database will live in RAM (of each slave node), and the way in which we feed the database to the RAM for each of the 30,000 jobs is as follows:

   - cp the database to /tmp/ of ALL of the slave nodes.
   - start the 30,000 jobs through SGE, where the database is READ from /tmp/ (on the local node) and the output is WRITTEN to the central file server.

   This is, of course, much faster than reading a GB-size database from the central file server 30,000 times.

2. another group here at CSHL is currently in the process of preparing the installation of a new cluster, and they have some good reasons for choosing Mosix.
But once in a while they also need to run Blast jobs, of similar sizes as ours. The question is: can Mosix + GFS + DFSA support a protocol similar to 1.?

Best regards, Ivo

P.S. Instead of writing N identical replicas of the database to the N slave nodes, one could keep just one copy of the database on /pvfs/, which is accessible through all of the slave nodes. Then, however, the GB-size database would need to be read through the network 30,000 times. Is this correct?

P.P.S. Do you know a smarter (than 1.) way of running the Blast jobs?
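
For readers who want to see the chopping step from point 0 spelled out, here is a minimal Python sketch of how the 101-kb pieces overlapping by 1 kb could be laid out. It is not the script used in the original workflow. The function name `overlapping_windows`, the half-open coordinate convention, and the round 3,000,000,000-base sequence length are all assumptions made for illustration.

```python
def overlapping_windows(seq_len, window=101_000, overlap=1_000):
    """Yield (start, end) half-open coordinates that tile seq_len bases.

    Consecutive windows share `overlap` bases, so each new window starts
    `window - overlap` bases after the previous one. The last window is
    truncated at the end of the sequence.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    step = window - overlap
    start = 0
    while start < seq_len:
        end = min(start + window, seq_len)
        yield start, end
        if end == seq_len:
            break  # the sequence is fully covered
        start += step


if __name__ == "__main__":
    # Hypothetical round figure for a genome-sized sequence (an assumption).
    n_pieces = sum(1 for _ in overlapping_windows(3_000_000_000))
    print(n_pieces)
```

With the assumed 3,000,000,000-base length, a 100-kb step gives 30,000 pieces, which is in line with the job count quoted in the message. Each (start, end) pair would then become one query file and one SGE job.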