[Bioclusters] Re: resources on administering clusters

03 Apr 2002 09:17:05 -0500

On Wed, 2002-04-03 at 03:44, Kris Boulez wrote:

[...]

> Does anyone on this list know how to evaluate the memory usage of a
> program running under Linux. We have a program that runs a long time

There are a number of tools: time, memprof, memusage, sar, and vmstat.

memprof (a RedHat-ism) and memusage are both reasonably good.  memusage
comes in the glibc-common rpm from RedHat (for the 7.2 distribution).

I use memusage like this:

	memusage -p prime1.png -T -t ./prime1.pl 10101010121

and it builds a png plot for me of my memory usage as a function of
time.

memprof lets me run a program as a child process and examine its memory
use while it runs, or in total at completion.  It is graphical.

> (days) and has a pretty big VmSize (~1.8 GB). Much to our surprise this
> runs pretty well on a 512MB machine (VmRSS around 300000 KB), although
> the authors of the code claimed that the code does access all the memory
> it has requested for (in a semi-random way).
> I have looked at graphs of VmSize and VmRSS vs. time and output of sar
> (~5 pages per sec paged in / out ).

My first question would be, what are your limits set to?  That is, look
at (under tcsh)

    [landman@protein.dtw.macsch.com:~]
    17 >limit
    cputime 	unlimited
    filesize 	unlimited
    datasize 	unlimited
    stacksize 	8192 kbytes
    coredumpsize 	unlimited
    memoryuse 	unlimited
    descriptors 	1024 
    memorylocked 	unlimited
    maxproc 	2047 
    openfiles 	1024 

Make sure your datasize, memoryuse, and other limits are set properly.
Second question: how large is your data set?  Third: how does the
program determine how much memory to use (from the get* functions, from
other functions, from /proc/meminfo, ...)?
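
If you would rather check the limits from inside a script than from the
shell, here is a minimal sketch using Python's standard resource module.
The mapping from tcsh limit names to RLIMIT_* constants is my assumption,
and some constants do not exist on every platform, so they are looked up
by name.  Note that sizes come back in bytes, not kbytes as tcsh prints
them.

```python
import resource

# Assumed mapping from the tcsh `limit` names to resource-module constants.
LIMITS = [
    ("cputime", "RLIMIT_CPU"),
    ("filesize", "RLIMIT_FSIZE"),
    ("datasize", "RLIMIT_DATA"),
    ("stacksize", "RLIMIT_STACK"),
    ("coredumpsize", "RLIMIT_CORE"),
    ("memoryuse", "RLIMIT_RSS"),
    ("descriptors", "RLIMIT_NOFILE"),
    ("memorylocked", "RLIMIT_MEMLOCK"),
    ("maxproc", "RLIMIT_NPROC"),
]


def fmt(value):
    """Render a limit value, using 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for every limit this platform defines."""
    out = {}
    for name, const in LIMITS:
        rid = getattr(resource, const, None)
        if rid is None:
            continue  # constant not available on this platform
        soft, hard = resource.getrlimit(rid)
        out[name] = (fmt(soft), fmt(hard))
    return out


if __name__ == "__main__":
    # Sizes are in bytes, cputime in seconds, the rest are counts.
    for name, (soft, hard) in current_limits().items():
        print(f"{name:<14}soft={soft:<12} hard={hard}")
```

A soft datasize or stacksize limit well below what the program asks for
would be the first thing to rule out.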

> Any other things I might look at and give me an idea about which parts
> of the memory of the program are accessed and how often ?

I would look at recompiling the program with profiling turned on.
Running gprof against a program compiled with -pg (for the GNU
compiler) lets you see where it is spending its time.

It is possible your data set is too small for the program to use all the
memory, so it allocates what it needs, and the kernel uses the rest as
buffer cache.  You can use xosview to look at this at a gross level.  It
is also possible that the program calls some particular function that
reports free memory size, and that value is low.  The issue then becomes
one of paging.  If your program implements paging by itself (e.g. tries
to do its own memory management), then under memory-constrained
conditions you would expect the block in/out rates to be quite high.
If I read your numbers correctly, this does not appear to be the case.
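
To track VmSize against VmRSS over a long run without a graphical tool,
something like the following sketch could be polled from cron or a loop.
It assumes a Linux /proc/<pid>/status file with "VmSize:" and "VmRSS:"
lines reported in kB; the function names are mine, made up for
illustration.

```python
import os


def parse_vm_fields(status_text):
    """Extract the Vm* fields (in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.startswith("Vm"):
            parts = rest.split()
            if parts and parts[0].isdigit():
                fields[key] = int(parts[0])
    return fields


def read_vm_fields(pid):
    """Read and parse the status file of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vm_fields(fh.read())


def resident_fraction(fields):
    """Fraction of the virtual size that is resident in RAM right now."""
    size = fields.get("VmSize", 0)
    return fields.get("VmRSS", 0) / size if size else 0.0


if __name__ == "__main__":
    # Demonstrate on this process itself; substitute the pid of the job.
    pid = os.getpid()
    if os.path.exists(f"/proc/{pid}/status"):
        fields = read_vm_fields(pid)
        print("VmSize kB:", fields.get("VmSize"))
        print("VmRSS  kB:", fields.get("VmRSS"))
        print("resident fraction: %.2f" % resident_fraction(fields))
```

A resident fraction that stays small while the paging rates reported by
sar stay low suggests the program is only touching a modest working set
at any one time.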

> Kris,

-- 
Joseph Landman, Ph.D.
Senior Scientist,
MSC Software High Performance Computing
email		: joe.landman@mscsoftware.com
messaging	: page_joe@mschpc.dtw.macsch.com
Main office	: +1 248 208 3312
Cell phone	: +1 734 612 4615
Fax		: +1 714 784 3774