On Thu, 24 Apr 2003, Bruce O'Neel wrote:

> basically a fast serial connection is a bad idea. If you use 100
> megabit ethernet you max out somewhere around 40 megabits or so
> because you can't use the full channel bandwith.

This is an "urban myth" spread 15 years ago by people trying to push
Token Ring.  Astonishingly, it's still repeated despite being trivially
disproved by everyday experience.

Typical TCP/IP/Ethernet bandwidth delivered to an application is well
over 90%.  You should see 11-12MB/sec for a Fast Ethernet file
transfer.  (With full 1500-byte frames, the preamble, Ethernet header,
FCS, and inter-frame gap add 38 bytes on the wire, and the IP and TCP
headers take another 40, so roughly 1460 of every 1538 bytes, about
95% of the raw 12.5MB/sec, reaches the application.)

The 35% or 40% number comes from a too-simple analysis of plain CSMA
(e.g. Aloha) instead of the CSMA/CD with exponential backoff that
Ethernet actually uses.  But even that is moot, since every current
Ethernet deployment uses a heavily buffered switch, likely with flow
control, rather than a shared collision domain.

> That, combined with modern OSs hard work to cache disks well, and then
> combined with cheap IDE hard disks, means that it almost always is a
> win to put your data locally.

That I completely agree with: disk bandwidth is still the least
expensive, most effective bandwidth available.  The key is dealing with
the complexity of many disks: knowing when they are being used as a
cache vs. as a persistent store of unique data.

> NFS is good for things like login directories where you read small
> files once or twice and for source code repositories where you don't
> keep re-reading the files.

NFS is an exceptionally efficient protocol for read-only small files,
and login directories are an excellent example of rarely changed,
concise configuration files.

> NFS is very bad for big files since (basically) every 8k bytes or so
> requires the file to be reopened on the server, then you have to seek,
> then 8k bytes is read, and then closed again.

That's a mischaracterization: each NFS request is independent (more
precisely, idempotent), but the server isn't implemented as an
open()/seek()/read-or-write()/close() sequence per request.
I wrote one of the first non-Sun NFS servers, implemented as user-level
code, so I'm familiar with the implementation options.

That's not to say that NFS is _good_ at reading large files.  It's not.
But the real weakness of NFS is its consistency model when writing or
doing directory modifications.

> To make things worse some labs then do the incremential aproach to
> NFS...

> Far better is to have a central NFS server for all of your home
> directories, and then have your central archives
> mirrored/rsynced/whatever to your different compute nodes.

I completely agree that NFS isn't a good cluster construction solution.
While it's possible to build a reasonable system by tuning the NFS
attribute and data caching parameters on a per-directory basis, this
adds major complexity to the file system configuration, demands real
expertise, and requires revisiting the decisions with every update or
usage change.

There are better, more efficient and manageable solutions for building
clusters than just blindly mirroring/rsyncing/imaging a whole
installation to each node.  PowerCockpit is an example of where that
leads.  The copying part is trivial.  The complexity is in the scripts
to configure each application.  And when you are finished, you just
have a quick way to do a reinstall, not an effective way to manage many
similar machines.

-- 
Donald Becker                           becker@scyld.com
Scyld Computing Corporation             http://www.scyld.com
410 Severn Ave. Suite 210               Scyld Beowulf cluster system
Annapolis MD 21403                      410-990-9993