[Bioclusters] Xserve G5 memory

Juan Carlos Perin bioclusters@bioinformatics.org
Tue, 05 Oct 2004 16:19:01 -0400


Well, for these benchmarks we focused on the larger NT database.  It is the
only database large enough to show significant lag time in search runs.  I
have confirmed that it is not overflowing into swap space; I didn't expect it
to, since we have plenty of memory, but I verified it anyway.
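
For anyone who wants a quick sanity check of database size against memory,
here is a rough Python sketch.  It is not how the swap check above was done;
it only adds up file sizes.  The function names, the directory, the "nt" base
name, and the 80% headroom figure are all placeholders, and it assumes the
formatdb volumes share a common prefix (nt.00.nsq, nt.00.nin, and so on).

```python
import glob
import os

def database_bytes(db_dir, base_name):
    """Total on-disk size of every file named base_name.<something> in db_dir."""
    pattern = os.path.join(glob.escape(db_dir), glob.escape(base_name) + ".*")
    return sum(os.path.getsize(p) for p in glob.glob(pattern) if os.path.isfile(p))

def fits_in_memory(db_dir, base_name, memory_bytes, headroom=0.8):
    """True if the database fits within a fraction (headroom) of memory_bytes.

    memory_bytes is supplied by the caller; the headroom value is an
    arbitrary illustration, not a measured threshold.
    """
    return database_bytes(db_dir, base_name) <= memory_bytes * headroom
```

A size check like this says nothing about what the kernel actually keeps
cached; it only rules out the case where the files could never fit.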

OS X can be kernel-tuned through 'sysctl', but it isn't obvious which settings
would make it retain a flat-file database in memory for this type of
application.
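
Independent of any kernel setting, one way to see whether a node keeps a file
in buffer cache at all is to read it twice and compare the timings.  A minimal
Python sketch, assuming nothing BLAST-specific; the function names are made
up, and the path is whatever database volume you want to test.

```python
import time

def read_through(path, chunk_size=1 << 20):
    """Read the whole file sequentially; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start

def cache_probe(path):
    """Read the file twice and report the size and both timings."""
    size, first = read_through(path)
    _, second = read_through(path)
    return {"bytes": size, "first_pass_s": first, "second_pass_s": second}
```

If the second pass is not clearly faster than the first, the file was probably
not retained (or it was already cached before the first pass, so the probe is
only meaningful on a file that has not been touched recently).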

One suggestion, from Aaron Darling, is to create a RAM disk in OS X, which I
will look into.  If that works, we may very well dedicate a node to this,
since searches are so much faster when the DB is in memory, and that is very
important in our scenario.
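
As a starting point for the RAM disk idea, the sketch below only works out the
size argument and builds the command; it does not run anything.  It assumes
the 'hdiutil attach -nomount ram://N' form, where N is a count of 512-byte
sectors, and that has not been tested on the Xserves.  The function names are
hypothetical.

```python
SECTOR_BYTES = 512  # assumption: ram:// sizes are given in 512-byte sectors

def ram_disk_sectors(size_bytes):
    """Number of 512-byte sectors needed to hold size_bytes, rounded up."""
    return (size_bytes + SECTOR_BYTES - 1) // SECTOR_BYTES

def attach_command(size_bytes):
    """Argument list that would create (but not format) a RAM-backed device."""
    return ["hdiutil", "attach", "-nomount",
            "ram://%d" % ram_disk_sectors(size_bytes)]
```

For example, attach_command(1024**3) asks for 2097152 sectors, i.e. 1 GB.  The
resulting device would still have to be formatted and mounted before the
database files could be copied onto it, and the RAM disk has to be sized to
leave enough memory for blastall itself.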

Thanks
  


On 10/5/04 4:01 PM, "Joe Landman" <landman@scalableinformatics.com> wrote:

> Unless you are using a memory mapped process reading these same files at
> the same time, it is likely that they are only showing up in buffer
> cache.  OSX probably has a very different cache retention policy as
> compared to AIX and Linux.  This is usually a kernel tunable.
> 
> Ask your Apple folks about how to tweak the kernel.
> 
> Another very important issue is the size of the file as compared to the
> size of memory (more specifically buffer cache).  If the file overflows
> memory, the mmap mechanism will happily ask the kernel to start paging.
> This is "A Bad Thing(tm)".  You want your database indices small enough to
> hold in memory for good performance.  You don't want them too small, as
> you will start to pay some costs associated with increased file activity.
> 
> Which database is being used?
> 
> Victor M. Ruotti wrote:
> 
>> Hi Juan,
>> How exactly do you hold your databases in memory? Do you do it through
>> programming? It may help to describe exactly how this is done. I am
>> also curious to know how you do it.
>> 
>> Victor
>> On Oct 5, 2004, at 12:09 PM, Juan Carlos Perin wrote:
>> 
>>> I have been running benchmarks with blastall on several different machines.
>>> We've come to realize that one of the biggest differences affecting search
>>> times is how the machines actually maintain the search databases in memory.
>>> 
>>> E.g., on our IBM 8-way machine, the databases are held in memory, which
>>> seems to be an effect of the architecture of the machine, and search times
>>> become incredibly fast after an initial run, which stores the database in
>>> memory.  The same effect seems to take place on our Dual Xeon Dell (PE
>>> 1650), which also outpaces the Xserves significantly after an initial run
>>> to populate the db in memory.
>>> 
>>> It would appear that the Xserves dump the db from memory after each search,
>>> even when submitting batch jobs with multiple sequences in a file.  Is
>>> anyone aware of how this functions, and how this effect might be changed to
>>> allow the db to stay in memory longer?  Thanks
>>> 
>>> Juan Perin
>>> Child. Hospital of Philadelphia
>>> 
>>> _______________________________________________
>>> Bioclusters maillist  -  Bioclusters@bioinformatics.org
>>> https://bioinformatics.org/mailman/listinfo/bioclusters