[Bioclusters] Re: [GE users] submit a job

chris dagdigian bioclusters@bioinformatics.org
Wed, 20 Nov 2002 11:35:14 -0500


Your life will be easier if your gridengine ncbi-blast system has the 
same path to the blastall binary on all of your execution hosts. Can you 
set this up, for instance with symbolic links? It makes writing your 
scripts much easier. If all else fails you can put the blastall binary 
on an NFS share; presumably you already have at least one NFS share 
going for the sge stuff.
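
As a rough sketch of the symlink approach (link_blastall is a made-up
helper name, and /usr/local/bin/blastall is only an assumed choice for
the common path), something like this could be run once on each host:

```shell
#!/bin/sh
# Hypothetical helper: make blastall reachable at one common path on a host.
# Both paths are placeholders; substitute the real locations on each machine.

link_blastall() {
    real_binary=$1    # where blastall actually lives on this host
    common_path=$2    # path that should be identical on every host

    if [ ! -x "$real_binary" ]; then
        echo "not an executable file: $real_binary" >&2
        return 1
    fi
    mkdir -p "$(dirname "$common_path")" || return 1
    # -s: symbolic link, -f: replace an existing link at the common path
    ln -sf "$real_binary" "$common_path"
}

# Example (run once on each execution host, usually as root):
#   link_blastall /path/to/blastprogram/blastall /usr/local/bin/blastall
```

After that, scripts can refer to the common path without caring which
host they land on.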

It would also be helpful to us if you posted the blastall error 
messages you are seeing; believe it or not, they actually do try to 
tell you what is going wrong. From memory, if you do not use the 
.ncbirc config file you need to set at least two environment 
variables: $BLASTDB, which points to the sequence databases, and 
$BLASTMAT, which points to the directory that holds all of the scoring 
matrices.
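
A minimal sketch of setting and checking those two variables before a
run follows. The directory names are assumptions, not your real layout,
and check_blast_env is just a name I made up for illustration:

```shell
#!/bin/sh
# Sketch: set and sanity-check the two variables before calling blastall.
# The directory names are assumptions; point them at your own data.

BLASTDB=${BLASTDB:-/data/blast/db}        # hypothetical database directory
BLASTMAT=${BLASTMAT:-/data/blast/matrix}  # hypothetical scoring-matrix directory
export BLASTDB BLASTMAT

check_blast_env() {
    status=0
    for var in BLASTDB BLASTMAT; do
        # Look up the value of the variable whose name is stored in $var.
        eval "dir=\$$var"
        if [ ! -d "$dir" ]; then
            echo "$var does not point to a directory: $dir" >&2
            status=1
        fi
    done
    return $status
}

# Typical use at the top of a job script:
#   check_blast_env || exit 1
```

Failing early like this tends to give a clearer message in the job
output than whatever blastall prints when it cannot find its files.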

Since you can't directly qsub a binary with gridengine, what you 
need to do is create a perl or shell script that runs the blast 
query for you. You should prove that this script works when run 
by hand on any execution node before you try sending it to 
gridengine via qsub.
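
Here is a sketch of what such a wrapper might look like. The paths, the
job name, the blastn program choice and the run_blast function are all
placeholders of mine; only the "#$" qsub options and the blastall
-p/-d/-i/-o flags are standard:

```shell
#!/bin/sh
# Sketch of a gridengine job script wrapping one blastall run.
# Lines starting with "#$" are options read by qsub.
#$ -S /bin/sh
#$ -cwd
#$ -j y
#$ -N blastjob

# Assumed locations; replace with the real ones on your execution hosts.
BLASTALL=${BLASTALL:-/usr/local/bin/blastall}
BLASTDB=${BLASTDB:-/data/blast/db}
BLASTMAT=${BLASTMAT:-/data/blast/matrix}
export BLASTDB BLASTMAT

run_blast() {
    query=$1      # FASTA file with the query sequence(s)
    database=$2   # database name, looked up under $BLASTDB
    output=$3     # file to receive the report

    # Record where the job ran so errors in the job output are traceable.
    echo "host: $(uname -n)  blastall: $BLASTALL" >&2
    "$BLASTALL" -p blastn -d "$database" -i "$query" -o "$output"
}

# Only run when the three arguments are supplied, by hand or via qsub.
if [ "$#" -eq 3 ]; then
    run_blast "$1" "$2" "$3"
fi
```

Saved as, say, blast_job.sh, it can be tried by hand on an execution
node with "sh blast_job.sh query.fa mydb result.txt" and, once that
works, submitted with "qsub blast_job.sh query.fa mydb result.txt".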

I'd suggest following up with a reply strictly to the 
bioclusters@bioinformatics.org mailing list as this is starting to get 
off-topic for the sge-users list. The bioclusters folks deal with blast 
farms all the time.


bioinfo Gu wrote:

> Hi,
> I have installed gridengine on two machines. Here is my setup: 
> one (athena) is the master node (not an execution node), the other 
> (apollo) is the execution node. I have submitted simple.sh and 
> sleeper.sh, and both of them work fine. But I want to run my own jobs 
> on gridengine. I have installed 'blastall' on both machines,
> athena: /path/to/blastprogram/blastall
> apollo: /path/to/blastprogram/blastall
> these two paths are different.
> I have checked the environment variables for the blastall program on 
> the execution node and they should be all right, but when I 'qsub 
> blastall_script', it always gives me some error about the blastall 
> program, so I suspect my environment variables are not right under 
> gridengine.
> My question is: when I run 'qsub blastall' on athena (master), which 
> program gets invoked? blastall will take advantage of the 
> resources of each execution node, is that right? Suppose that I have 
> more than one execution node, how can I specify the environment 
> variables (or other resources)? I found this somewhere, but now I 
> cannot remember where.
> Thank you very much.
> Grace  

Chris Dagdigian, <dag@sonsorol.org>
Bioteam.net - Independent Bio-IT & Informatics consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E Yahoo IM: craffi Web: http://bioteam.net