[Bioclusters] blastall and SGE

Joe Landman bioclusters@bioinformatics.org
Wed, 29 Sep 2004 15:39:31 -0400


Hi Juan:

  You might want to trim your cc list ... :)

  You should have a look at mpiBLAST.  It might be able to handle what 
you want, and we have a nice run script to drive it (works great with 
gridengine).  See 
http://www.scalableinformatics.com/metadot/index.pl?id=2213&isa=Category&op=show 
for more details.  We have a web-ified version of this for another 
customer, somewhat specialized to their environment.
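
On the segment counts in your note below (15 segments, one per node,
versus 30 or 32, one per processor), here is a rough back-of-the-envelope
sketch in Python.  It assumes each database segment runs as one
single-threaded job, every processor is one scheduler slot, and all
segments take about the same time.  None of that is confirmed for your
setup, and the function name slot_utilization is made up for
illustration.  It only gives an upper bound on how busy the slots can
be, so it does not by itself explain the 24% you are seeing.

```python
import math


def slot_utilization(segments: int, slots: int) -> float:
    """Upper bound on the fraction of slots kept busy.

    Assumes one single-threaded job per database segment, equal run
    time per segment, and one job per slot at a time (all assumptions,
    not measured behaviour of SGE or blastall).
    """
    if segments <= 0 or slots <= 0:
        raise ValueError("segments and slots must be positive")
    # Jobs run in "waves": each wave fills at most `slots` slots.
    waves = math.ceil(segments / slots)
    # Busy slot-time divided by total slot-time over all waves.
    return segments / (waves * slots)


if __name__ == "__main__":
    for segments, slots in [(15, 30), (30, 30), (32, 32)]:
        print(segments, slots, slot_utilization(segments, slots))
```

Under those assumptions, 15 segments on 30 processors can keep at most
half of them busy, while one segment per processor can in principle keep
all of them busy, which is at least consistent with your work-around
helping.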

Joe

Juan Carlos Perin wrote:

>
> Sun Grid Engine doesn't seem to utilize all the empty resources that 
> it should.  When I run btblastall on the command line on a search 
> against NT ( which has been partitioned into 15 segments), only three 
> machines actually get queued up for blastall jobs.  Also, during this 
> process I do not see processor usage ever going above 24%.  I would 
> hope, or expect, that more idle nodes would receive blastall jobs 
> with one of the 15 segments of the DB.  This is very disappointing 
> considering a single G5 can search the NT database in under 3 minutes, 
> while running on multiple nodes actually takes well over ten minutes.
>
> I would also like to tweak or configure SGE to allow more efficient 
> usage of idle resources, but I don't know how.  On the same note, it 
> seems that even running a regular blastall job from the command line 
> on a single machine is also somehow restricted to a certain amount of 
> CPU usage.  (usually no more than 60% CPU usage).  I'm wondering if 
> there is a way to allow greater CPU usage overall.
>
> The only work-around that seems to really work is running btblastall 
> on the command line with a database that has been forced into many 
> more segments: 30 or 32 (one for every processor) rather than 15 (one 
> for every node).  This, on the command line, seems to distribute jobs 
> a little more efficiently, as well as utilizing more CPU power than 
> any other run.
>
> Any thoughts would be VERY helpful.
>
> Thanks,
> Juan
>
> _______________________________________________
> Bioclusters maillist  -  Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 612 4615