Chris Dagdigian wrote:
>
> Diskless clusters are generally a bad idea for any system that will be
> running life science informatics workflows. In general terms (without

Agreed with caveats.

> knowing your specific applications) it is generally a rule that most
> life science codes are performance bound by memory and disk access I/O

Yes and yes.

> limitations. If you build a diskless cluster and do lots of assemblies
> on genomes that are larger than what can fit in memory your cluster is
> going to be bogged down while nodes wait on network I/O requests.

If you can provide a 100 MB/s link to IO locally, and a 1000 MB/s link
to IO remotely, the latter is likely to be faster, irrespective of the
disk stateful/stateless nature.

> You can always tell a HPC vendor who knows nothing about life science
> because they try to sell you (a) MPI-optimized "beowulf" systems or (b)

Er... need to be careful here, MPI-HMMer, mpiBLAST, mpiClustalW and
other MPI codes do have some latency sensitivity, more scalability
limitations due to design issues in their work scheduler, but they also
need to schedule IO. Infiniband (SDR) is not a bad technology (nor is
10GbE, though that is not cost effective yet, getting there, but not
yet).

> diskless systems, both of which are totally (in most cases)
> inappropriate technology choices for our styles of work.

... sometimes ... . I am seeing a higher proportion of systems that are
also doing MD and additional latency sensitive stuff. That said, IB and
others really don't make sense below a certain size unless your code is
latency sensitive by design. And this does happen, though less
frequently in informatics than other areas. The aforementioned mpi
codes are examples of some that are going this route.

We are seeing ~60x with mpiHMMer on 64 cores, before we use hardware
acceleration. If we run in an "overcommitted manner" (many more cores
per IB port) we see this impact.
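
[back-of-the-envelope sketch, not a measurement] The two numbers above
(100 MB/s vs 1000 MB/s IO, and ~60x on 64 cores) can be put side by side
with a few lines of Python. The function names and the 10 GB working-set
size are assumptions made up for illustration; the model ignores
latency, contention and caching, so treat it as rough arithmetic only.

```python
def transfer_seconds(size_mb: float, bandwidth_mb_s: float) -> float:
    """Time to stream size_mb megabytes at a sustained bandwidth (latency ignored)."""
    if bandwidth_mb_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_mb / bandwidth_mb_s


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Hypothetical 10 GB working set (this size is an assumption, not from the thread).
dataset_mb = 10_000

local_s = transfer_seconds(dataset_mb, 100)    # 100 MB/s local IO
remote_s = transfer_seconds(dataset_mb, 1000)  # 1000 MB/s remote IO
efficiency = parallel_efficiency(60, 64)       # ~60x observed on 64 cores

print(f"local IO : {local_s:.0f} s")
print(f"remote IO: {remote_s:.0f} s")
print(f"parallel efficiency: {efficiency:.1%}")
```

Under these assumptions the remote link streams the same data in a tenth
of the time, and 60x on 64 cores works out to roughly 94% of linear
scaling.
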
If we run over pure gigabit, the scalability is lower, though most folks
won't care about 50x vs 60x.

> I think you are on to something with the SMP idea. You should research
> how many CPUs and how much memory you can pack into a single chassis
> along with a good performing storage subsystem.

[not a commercial, just info]

We have built desktop units that have 4-8 cores, 32 GB ram, and >1TB
disk with 500+ MB/s bandwidth for customers running research apps (CFD,
bio, chem, ...). With the soon to be available Barcelona, we will be
able to pack 16 cores, 64 GB ram and a fast IO subsystem in a single
system image deskside (SMP). We are seeing *lots* of interest in these
systems, including people wanting to build clusters out of them. Having
used a dual and then a quad core unit as a desktop for a while, I can
tell you it is quite nice when you have enough ram and IO capability.

> My $.02 of course !

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615