I'll put 2 responses in one here (the one on GigE and this one), as they both address putting file systems on a wire for sharing them. Chris D did a great job talking about the differences between servers; I am going to talk about the scaling aspects. The name of this game is not to run out of bandwidth, either on the network (first portion) or on the disks (second portion). You need some numbers and some back-of-the-envelope calculations. The numbers can be found on the drive makers' web sites. Terminology: MB/s is megabytes per second, Mb/s is megabits per second; there is (almost) a factor of 10 relating the two.

Suppose you decide you want to hang some drives off of a GigE-connected server. Each computer on a single 100 Base T link is capable of sinking or sourcing about 11 MB/s (about 90 Mb/s) running flat out. This isn't a theoretical max; this is achievable, realizable bandwidth. If you can get the same 90% of peak on your GigE (1000 Mb/s) link, then you should be able to get about 900 Mb/s achievable. Quick division yields about 10 of the 100 Base T links per GigE link before you fill up. If you are using single connections per machine, this is 1 GigE feeding 10 machines. If you channel bond your 100 Base T's, then you are looking at 1 GigE per 5 machines. This of course assumes that you are 1) fully saturating the 100 Base T pipes all the time, and 2) doing large-block sequential accesses (reads or writes). Reality is never so simple.

You can measure your process's I/O utilization at a coarse level by running

	vmstat 1 > /tmp/log.IO

and then looking at the columns labeled bi and bo (blocks in and out, respectively). It isn't that hard to calibrate a maximal load against a quiescent disk or file server: just copy a very large file there after launching the vmstat in the background. This will give you a max value for reading and writing (if you choose to do both). Now run your process.
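The link-scaling arithmetic above can be sketched in a few lines. This is just the back-of-the-envelope math from the text (the ~90%-of-peak "achievable" figure is the assumption here), not a measurement tool:

```python
# Back-of-envelope sketch of the GigE vs. 100 Base T scaling above.
# Assumes ~90% of peak is realizable, per the numbers in the text.

def achievable_mbps(peak_mbps, efficiency=0.9):
    """Realizable bandwidth (Mb/s) on a link running flat out."""
    return peak_mbps * efficiency

gige   = achievable_mbps(1000)   # ~900 Mb/s on a GigE link
fast_e = achievable_mbps(100)    # ~90 Mb/s on a 100 Base T link

print(gige / fast_e)             # ~10 single-link clients per GigE
print(gige / (2 * fast_e))       # ~5 clients with channel-bonded pairs
```

Plug in your own measured efficiency if your network does better or worse than 90% of peak.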
More often than not, you will see spikes to near the maximum and then long periods of low usage, but it is possible that you will see long periods of intense I/O. This is application dependent. From this you can estimate a "duty cycle", or an average utilization. You can eyeball this if you don't want to measure it; just make sure you err on the side of larger utilization (i.e. round up).

So your 100 Base T interfaces will each use, on average, this utilization percentage of their bandwidth. What you want to do is scale the number of machines hanging off this GigE interface so that the utilization multiplied by the number of 100 Base T interfaces comes to something like 80% of the realizable bandwidth. This tells you approximately how many machines you can service from this GigE running this application, with this type of data. So, for your case, 40 machines with 1 interface per machine would require an I/O utilization below 20% per node to be really serviceable from the single GigE interconnect. If you are effectively streaming database indices off the disk to each node, this will be problematic: you will likely run out of network bandwidth around the 10 node mark.

Now onto the disks themselves. Suppose that you have budget for a nice set of Seagate ST336752LW drives (15k RPM, 36.7 GB). These disks can sustain in excess of 50 MB/s (from 508 to 706 Mb/s); see http://www.seagate.com/cda/products/discsales/enterprise/tech/0,1084,379,00.html. Remember that GigE is 1000 Mb/s, so 2 of these disks could keep a GigE full if they are running flat out.

If you put these into a Linux server versus a NAS box, you are going to run into a few issues. First: if your IDE/SCSI controller is on the same PCI bus (PCI 133 MB/s, 100 MB/s realizable) as your network (GigE), you are going to cause that poor machine to whimper. You need either a PCI 266 MB/s bus (200 MB/s realizable), or multiple PCI busses.
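The sizing rule above can be written down as a tiny formula: keep (nodes x per-node peak x average utilization) under ~80% of the GigE's realizable bandwidth. A minimal sketch, using the 90/900 Mb/s figures from the text (the function name and parameters are mine):

```python
# Duty-cycle sizing sketch: how busy can each node be, on average,
# before N nodes overrun ~80% of a single GigE's realizable bandwidth?

def max_utilization(nodes, node_peak_mbps=90, gige_mbps=900, headroom=0.8):
    """Highest sustainable average per-node I/O utilization (0.0-1.0)."""
    return headroom * gige_mbps / (nodes * node_peak_mbps)

print(max_utilization(40))   # ~0.2, the "below 20% per node" figure
print(max_utilization(10))   # ~0.8, why streaming nodes top out near 10
```

Feed it the utilization you measured with vmstat instead of eyeballing, and round up.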
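The same back-of-the-envelope approach applies to the disk and bus side. A sketch using the figures quoted here (50 MB/s sustained per drive, Ultra160's nominal 160 MB/s per channel, ~200 MB/s realizable on a 266 MB/s PCI bus; treat GigE as ~100 MB/s):

```python
# Rough bus-budget sketch for the disk discussion. All figures are
# the ones quoted in the text, in MB/s; factor of 8 converts to Mb/s.

DISK_MBS   = 50    # sustained transfer rate, per ST336752LW
U160_MBS   = 160   # nominal Ultra160 channel bandwidth
PCI266_MBS = 200   # realizable on a 266 MB/s PCI bus
GIGE_MBS   = 100   # ~1000 Mb/s is ~100 MB/s realizable

print(2 * DISK_MBS * 8)            # 800 Mb/s: two drives nearly fill a GigE
print(U160_MBS // DISK_MBS)        # 3 drives flat out max one U160 channel
print(2 * U160_MBS > PCI266_MBS)   # True: two U160s alone overrun the bus
```

The last line is the whole problem in one comparison: two saturated U160 channels already exceed the bus, before the GigE card has moved a byte.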
Either way, this immediately takes most of the common non-server-oriented machines off the table as potential candidates: you can easily fill up the PCI bus on these things with enough I/O traffic. Second: ATA100 would require 2 channels to attach this type of drive (you want one dedicated ATA100 channel per drive). Ultra160 is great, but you need a 266 MB/s bus to plug it into, and 3 of the Seagate drives on that controller will pretty much max out the controller if the disks are running flat out. You cannot put 2 of these U160s on a single PCI 266 system; you wouldn't have room left for the GigE.

Basically, building these things into a Linux box is somewhat hard; there are many little gotchas. Look at the boxes Chris indicated, and possibly the 3ware boxes as well. Some people I know swear by them.

Joe

On Thu, 2002-04-18 at 18:22, Ivo Grosse wrote:
> Hi all,
>
> we want to buy a new fileserver (for our cluster) with about 1 TB, and
> we are thinking of a Linux machine. My question is: which kind of
> fileserver do YOU use (and why)?
>
> (a) NAS (network-attached storage)?
>
> (b) regular Linux machine with internal RAID?
>
> (c) regular Linux machine with external RAID?
>
> Thanks!!!
>
> Ivo
>
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> http://bioinformatics.org/mailman/listinfo/bioclusters

-- 
Joseph Landman, Ph.D.
Senior Scientist, MSC Software High Performance Computing
email       : joe.landman@mscsoftware.com
messaging   : page_joe@mschpc.dtw.macsch.com
Main office : +1 248 208 3312
Cell phone  : +1 734 612 4615
Fax         : +1 714 784 3774