[Bioclusters] mpiblast and its database

Jeremy Mann bioclusters@bioinformatics.org
Thu, 15 May 2003 12:21:00 -0500 (CDT)


> Hi Jeremy,
>
> I have two questions for you.
> (1)After you did mpiformatdb -N 44, how many fragments
> did you get? Did you get 45 fragments or 44?
> (2)When you run mpiBlast, how many cpus or nodes you
> specify? (I mean, what is your number for xx in mpirun
> -np xx).

I actually ended up with 46 segments. mpiformatdb split the database into
10 MB segments, and protein nr is 459 MB, so the 46th and last segment
(nr.45) is about 6 MB.
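
The segment count above follows from a ceiling division. Here is a minimal
sketch, assuming fragments of exactly 10 MB. Real fragment sizes vary, so it
estimates only the count and the index of the last fragment, not its size.
The variable names are illustrative.

```shell
#!/bin/sh
# Figures taken from the message above: a 459 MB database split into
# fragments of roughly 10 MB each (exact 10 MB is an assumption).
db_mb=459
frag_mb=10

# Ceiling division: a partial fragment still counts as one fragment.
fragments=$(( (db_mb + frag_mb - 1) / frag_mb ))

# Fragments are numbered from zero, so the last one is fragments - 1.
last_index=$(( fragments - 1 ))

echo "fragments: $fragments"
echo "last fragment: nr.$last_index"
```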

I run mpiblast as follows:

time mpirun -np 46 mpiblast -p blastp -d nr -i nr-protein.fasta \
    -o nrprotein-np46.out

The master node actually holds 4 segments of the db. When I run mpirun in
verbose mode, it tells me I am starting 4 jobs on n0, then 2 jobs on each
of n1 through n20.

Also, I have an update on the user situation. If you chmod -R 777 the
local storage directory, then when another user runs mpiblast, he/she will
not experience the lag caused by the segments being copied. This is the
only way around sharing local storage among a number of users. If you're
really paranoid about security, you could always make an 'mpiblast'
group and assign blast users to it. I have tested this with 3 user
accounts: none of them copied segments to local storage, and the timed
tests were very similar, anywhere from 5.4 to 5.9 seconds.
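
The group-based alternative could be sketched roughly as follows. This is
only an illustration under stated assumptions: the message does not spell
out the exact group permissions, so the modes below are a guess at a
reasonable setup. To keep the sketch runnable without root, a scratch
directory stands in for the local storage directory, the current user's
primary group stands in for the 'mpiblast' group, and nr.00 is a placeholder
file name.

```shell
#!/bin/sh
# Stand-ins so the sketch runs without root: a scratch directory plays the
# local storage directory, and the current user's primary group plays the
# dedicated 'mpiblast' group.
storage=$(mktemp -d)
group=$(id -gn)

# On a real cluster an administrator would first do something like:
#   groupadd mpiblast
#   usermod -a -G mpiblast <username>
# and then use group=mpiblast and the real storage path below.

# Placeholder for a database segment that has already been copied here.
touch "$storage/nr.00"

# Hand the tree to the group; owner and group get read/write access
# (plus directory traversal), everyone else gets nothing.
chgrp -R "$group" "$storage"
chmod -R u=rwX,g=rwX,o= "$storage"

# Set the setgid bit on directories so files created later inherit the group.
find "$storage" -type d -exec chmod g+s {} +

ls -ld "$storage" "$storage/nr.00"
```

Compared with chmod -R 777, this keeps the directory closed to accounts
outside the group while still letting every group member reuse segments
that someone else has already copied.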



-- 
Jeremy Mann
jeremy@biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672

