Hi Ognen,

The price knee is at about the 60 GB mark: 40 GB drives are in the
US$60-70 range and 60 GB drives are around US$80. I would recommend
using software RAID0 with IDE (not hardware RAID).

Joe

On Thu, 24 Apr 2003, Duzlevski, Ognen wrote:

> Hi Bruce,
>
> Given the (ever increasing) sizes of the databases bioinformatics
> software is run against, what size of local space would you recommend?
>
> Ognen
>
> >-----Original Message-----
> >From: Bruce O'Neel [mailto:bruce.oneel@obs.unige.ch]
> >Sent: Thursday, April 24, 2003 9:55 AM
> >To: bioclusters
> >Subject: [Bioclusters] Re: blast and nfs
> >
> >Hi,
> >
> >I thought that I'd emphasize a few things that Chris and Joseph have
> >already said.
> >
> >Except for a few small subfields, scientific computing tends to be I/O
> >bound. As already pointed out, feeding a lot of data through what is
> >basically a fast serial connection is a bad idea. If you use 100
> >megabit Ethernet, you max out somewhere around 40 megabits or so,
> >because you can't use the full channel bandwidth. That is somewhere
> >around 4 or 5 megabytes per second, which most of you will recognize
> >is well below what even a low-end hard disk can deliver. Things only
> >improve by a factor of 10 or so if you use gigabit Ethernet, so that
> >doesn't really save you either.
> >
> >That, combined with the hard work modern OSs do to cache disks well,
> >and combined with cheap IDE hard disks, means that it is almost
> >always a win to put your data locally. Using disk striping helps even
> >more, but it may not always be necessary and should be tested.
> >
> >NFS is good for things like login directories, where you read small
> >files once or twice, and for source code repositories, where you
> >don't keep re-reading the files.
> >
> >NFS is very bad for big files, since (basically) every 8 KB or so
> >requires the file to be reopened on the server; then you have to
> >seek, then 8 KB is read, and then the file is closed again.
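[Editor's note: to put rough numbers on Bruce's throughput argument, here is a back-of-the-envelope sketch. The ~40% usable-channel fraction is Bruce's estimate, not a measured value, and the function name is mine.]

```python
def effective_mbytes_per_sec(link_mbits, usable_fraction=0.4):
    """Back-of-the-envelope effective throughput:
    usable bits/s divided by 8 bits per byte."""
    return link_mbits * usable_fraction / 8

fast_ethernet = effective_mbytes_per_sec(100)    # ~5 MB/s
gigabit       = effective_mbytes_per_sec(1000)   # ~50 MB/s, still
                                                 # below a local striped disk
```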
> >
> >To make things worse, some labs then take the incremental approach to
> >NFS: as you add each system, the spare disk space on that system is
> >dedicated to something and then mounted on all the other systems.
> >This is very bad, since for most work to happen ALL systems then have
> >to be up and functioning. Plus you end up with NFS traffic all over
> >your network. It does keep your switch busy though :-)
> >
> >Far better is to have a central NFS server for all of your home
> >directories, and then have your central archives
> >mirrored/rsynced/whatever to your different compute nodes.
> >
> >Of course, your mileage may vary, since each lab is different.
> >
> >cheers
> >
> >bruce
> >
> >--
> >.. there is no area or function that someone can't try to put together
> >with bubble gum and bailing wire.  -- Strata Chalup
> >
> >Bruce O'Neel                 phone:  +41 22 950 91 57
> >INTEGRAL Science Data Centre         +41 22 950 91 00 (switchb.)
> >Chemin d'Ecogia 16           fax:    +41 22 950 91 35
> >CH-1290 VERSOIX              e-mail: Bruce.Oneel@obs.unige.ch
> >Switzerland                  WWW:    http://isdc.unige.ch/
> >
> >_______________________________________________
> >Bioclusters maillist  -  Bioclusters@bioinformatics.org
> >https://bioinformatics.org/mailman/listinfo/bioclusters
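[Editor's note: Joe's software-RAID0 suggestion combined with Bruce's mirror-to-nodes pattern might look roughly like the sketch below on a Linux compute node. The device names, mount point, and server path are placeholders, and `mdadm` is assumed (labs in 2003 may have used the older raidtools instead); this is an illustration, not the list's prescribed setup.]

```shell
# Stripe two IDE disks into one md device (software RAID0).
# /dev/hdb and /dev/hdd are placeholder disk names; adjust to your hardware.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/hdb /dev/hdd
mke2fs /dev/md0
mkdir -p /scratch/db
mount /dev/md0 /scratch/db

# Mirror the central database archive to the local striped disk,
# instead of reading it over NFS; rerun (e.g. from cron) to pick up updates.
rsync -av --delete fileserver:/archive/blastdb/ /scratch/db/blastdb/
```

Each node then runs BLAST against its local copy at local-disk speed, and only the periodic rsync touches the network.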