[Bioclusters] BioPerl 1.2.3 and memory handling

Michael Maibaum bioclusters@bioinformatics.org
Thu Nov 11 12:52:07 EST 2004


On 10 Nov 2004, at 18:25, Al Tucker wrote:

> Hi everybody.
>
> We're new to the Inquiry Xserve scientific cluster and trying to iron 
> out a few things.
>
> One thing we seem to be coming up against is an out-of-memory error 
> when getting large sequence analysis results (5,000 sequences and 
> above) back from BTblastall. The problem seems to be with BioPerl.
>
> Might anyone here know if BioPerl knows enough not to try to 
> access more than 4 GB of RAM in a single process (an OS X limit)? I'm 
> told Blastall and BTblastall do, and will chunk problems accordingly, 
> but we're not certain whether BioPerl does when called to merge large 
> BLAST results back together. It's the default version 1.2.3 that's 
> supplied, btw, on OS X 10.3.5 with all current updates just short of 
> the latest 10.3.6 update.

BioPerl tries to slurp up the entire result set from a BLAST query 
and builds objects for each little bit of it, which uses lots of 
memory. It doesn't do anything smart about breaking up the job within 
the result set, afaik.
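
To illustrate the difference between slurping and streaming, here is a 
minimal sketch in Python (not BioPerl). It assumes the report was 
written in tabular form (blastall -m 8: twelve tab-separated columns, 
query ID first, bit score last) and that all hits for one query sit on 
consecutive lines. The function names are made up for this example. 
Only one query's hits are held in memory at a time.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def iter_queries(handle):
    """Yield (query_id, hits) for one query at a time.

    Assumes tabular BLAST output with the query ID in column 1 and
    all hits for a given query on consecutive lines.
    """
    rows = (
        row
        for row in csv.reader(handle, delimiter="\t")
        if row and not row[0].startswith("#")  # skip blank and comment lines
    )
    for query_id, hits in groupby(rows, key=itemgetter(0)):
        # Only this query's hits are materialised here.
        yield query_id, list(hits)


def best_hits(handle):
    """Yield (query_id, subject_id, bit_score) for the top hit of each query."""
    for query_id, hits in iter_queries(handle):
        top = max(hits, key=lambda row: float(row[11]))  # column 12 = bit score
        yield query_id, top[1], float(top[11])
```

Because both functions are generators, the caller can write each 
result out as it arrives rather than collecting everything first.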

I ended up stripping out the results that exceed a certain threshold 
size and running them on a different, large-memory Opteron/Linux box, 
and I'm experimenting with replacing BioPerl with BioPython etc.
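
The size-based routing can be sketched roughly as below. This is only 
an illustration: the function name and the threshold are hypothetical, 
and the right cut-off depends on how much memory the parser needs per 
byte of report, which I haven't measured.

```python
import os


def split_by_size(paths, threshold_bytes):
    """Partition result files into (small, large) by on-disk size.

    Files larger than threshold_bytes go in the second list, to be
    processed on the large-memory machine. The threshold is an
    assumption to be tuned, not a known safe value.
    """
    small, large = [], []
    for path in paths:
        if os.path.getsize(path) > threshold_bytes:
            large.append(path)
        else:
            small.append(path)
    return small, large
```

The small list can then be parsed on the cluster nodes as usual and 
the large list copied to the other box.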

Michael



-- 
Dr Michael Maibaum
Department of Biochemistry and Molecular Biology, UCL
email: maibaum@biochemistry.ucl.ac.uk



