[Bioclusters] Re: BLAST job on SGE

Bernard Li bli at bcgsc.ca
Fri Dec 3 18:41:15 EST 2004


Wouldn't it be better to set up the SMP PE and then simply use the -a
argument for blastall?

Or is that not supported on the Mac?
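
A minimal sketch of that combination (the PE name "smp", the "nt"
database, and the file names are placeholders, not anything from an
actual setup):

```shell
# Reserve both slots on a dual-CPU node via the "smp" PE, then run
# blastall with -a 2 so it threads across both CPUs itself.
# PE name, database, and file paths here are illustrative only.
qsub -pe smp 2 -b y blastall -p blastn -d nt -a 2 \
    -i query.fa -o query.out
```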

P.S. I would also recommend looking into mpiBLAST

Cheers,

Bernard 

> -----Original Message-----
> From: bioclusters-bounces at bioinformatics.org 
> [mailto:bioclusters-bounces at bioinformatics.org] On Behalf Of 
> Chris Dagdigian
> Sent: Friday, December 03, 2004 12:12
> To: Clustering, compute farming & distributed computing in 
> life science informatics
> Subject: Re: [Bioclusters] Re: BLAST job on SGE
> 
> 
> The approach suggested below will solve the "how do I submit 
> one job that will get 100% of an SMP machine" problem, but it 
> will not stop the SGE scheduler from filling all the available 
> job slots on a single box before moving on to a different 
> compute node.
> 
> The "smp" trick below works by setting up a Parallel 
> Environment called "smp" within grid engine.
> 
> You would create the PE by running the command "qconf -ap 
> smp" and filling in the values listed below.
> 
> Once that is done you can run jobs that take 100% of the 
> available job slots on any given node. The end result is that 
> your job gets sole use of the machine while it runs.
> 
> On a cluster of dual-CPU boxes you would submit your job like this:
> 
>   $ qsub -pe smp 2 ./my-job-script
> 
> In effect you are asking for 2 parallel job slots, and since 
> this happens to match the total number of slots available on a 
> 2-way system, you end up getting sole use of the machine while 
> your job runs. You are never really doing any parallel work, 
> just using the PE mechanism to take more than one job slot for 
> your job.
> 
> There are lots of other approaches that may be better for 
> particular people:
> 
> 1. If this is your standard use case and you *never* want to 
> allow more than one job to run on a node at any time, then the 
> simplest solution is just to edit your SGE queue 
> configuration and set the "slots" value to "1" on each compute 
> node. That will force the scheduler to allow only one job per 
> node at any time.
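> 
> A sketch of that change (the queue name "all.q" is an assumption; 
> older SGE releases use per-host queues instead):
> 
> ```shell
> # Open the queue configuration in an editor and set "slots 1":
> qconf -mq all.q
> # Or non-interactively, replacing the slots attribute directly:
> qconf -rattr queue slots 1 all.q
> ```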
> 
> 
> 2. Another (weird) method is to assign a numerical value to "seq_no" 
> within each of your queues and then adjust the SGE 
> scheduler's "queue_sort_method" parameter so that 
> "queue_sort_method=seqno". If you do that, SGE will 
> attempt to farm out jobs according to the order in which 
> queues have set their "seq_no" values. This is probably not 
> optimal for most people.
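> 
> For the record, that would look something like this (the queue 
> name is an example):
> 
> ```shell
> # Give each queue a distinct sequence number, e.g. "seq_no 10"
> # in one queue, "seq_no 20" in the next (interactive edit):
> qconf -mq node01.q
> # Then tell the scheduler to sort queues by sequence number
> # instead of by load (set "queue_sort_method seqno"):
> qconf -msconf
> ```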
> 
> 3. The most interesting approach is detailed in the man page 
> for "sched_conf", where the description for the 
> "job_load_adjustments" parameter says this:
> 
> 
> >        If your load_formula simply consists of the CPU load average
> >        parameter load_avg and if your jobs are very compute intensive,
> >        you might want to set the job_load_adjustments list to
> >        load_avg=100, which means that every new job dispatched to a
> >        host will require 100 % CPU time and thus the machine's load
> >        is instantly raised by 100.
> 
> 
> This could be a way of getting round-robin allocation done 
> outside of a parallel environment. In effect you artificially 
> boost the internal load value that SGE "sees" right after the 
> job starts, which should have the effect of causing the SGE 
> scheduler to move on to the *next* machine rather than packing 
> more jobs into the remaining job slots. Setting load_avg to 100 
> should be fine because the normalized np_load_avg value 
> is aware of multiple-CPU SMP systems.
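> 
> Concretely, something like this in the scheduler configuration 
> (a sketch; the decay time shown is the SGE default):
> 
> ```shell
> # Edit the scheduler configuration interactively:
> qconf -msconf
> # and set these entries:
> #   job_load_adjustments        load_avg=100
> #   load_adjustment_decay_time  0:7:30
> ```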
> 
> 
> -Chris
> 
> 
> 
> 
> 
> 
> Juan Carlos Perin wrote:
> 
> > I'm also interested in this, and have been playing with the
> > configuration. Our friends at Penn have fixed this and created a
> > queue configuration that allows single jobs to execute on all
> > available nodes, as opposed to running two jobs per node, which
> > isn't desirable. It seems to me that the idea was to run one blast
> > job on each CPU, thus two jobs on each machine, but the
> > architecture doesn't necessarily work like that, and instead waits
> > for one to finish, or re-queues the job.
> > 
> > This is a configuration that was suggested by the very kind people
> > at Penn, that seems to work for this situation:
> > 
> >         # qconf -sp smp
> >         pe_name           smp
> >         queue_list        all
> >         slots             999
> >         user_lists        NONE
> >         xuser_lists       NONE
> >         start_proc_args   NONE
> >         stop_proc_args    NONE
> >         allocation_rule   $pe_slots
> >         control_slaves    FALSE
> >         job_is_first_task FALSE
> > 
> > I have yet to test it myself.
> > 
> > Juan Perin
> > Bioinformatics Core
> > Children's Hospital of Philadelphia
> > 
> > _______________________________________________
> > Bioclusters maillist  -  Bioclusters at bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/bioclusters
> 
> -- 
> Chris Dagdigian, <dag at sonsorol.org>
> BioTeam  - Independent life science IT & informatics consulting
> Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
> PGP KeyID: 83D4310E iChat/AIM: bioteamdag  Web: http://bioteam.net

