[Bioclusters] BLAST speed mystery

Justin Powell jpowell at takedacam.com
Tue Mar 2 13:19:15 EST 2010


I should clarify that the actual BLAST executable is running on the same Intel-based Linux server in both cases; in one case the data is coming off local RAID0 15K SAS drives, and in the other the data is coming in over NFS from the Mac server. So there is no BLAST version running on the G4.
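
In case the mount parameters matter here, for reference the NFS mount options in effect (rsize/wsize, protocol, version) can be checked like this; nothing below is specific to my setup:

# List mounted NFS filesystems with their effective mount options
nfsstat -m

# Or just pull the NFS entries from the mount table
mount -t nfs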

Meanwhile I did some tests altering the read-ahead setting on the RAID0 config on the local drives. The default read-ahead is 256 sectors, which corresponds to 128K. I've found that raising this to 1024 sectors (512K) or higher reduces the times significantly (see below). However, tests with dd show that a 1024-sector read-ahead is not really any faster than a 256-sector read-ahead for simple streaming of a file (both are a lot better than no read-ahead). So I'm wondering whether this is still fundamentally some kind of IOPS limitation, resulting from BLAST not streaming the entire file in but instead making random accesses to it, with the larger read-ahead just making it more likely that a random access finds its data already in RAM.
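
For anyone wanting to reproduce the tuning, a minimal sketch of the commands involved; /dev/sda stands in for the RAID0 block device and the database path is only an example:

# Query the current read-ahead, in 512-byte sectors (256 = 128K default)
blockdev --getra /dev/sda

# Raise the read-ahead to 1024 sectors (512K)
blockdev --setra 1024 /dev/sda

# Simple streaming comparison with dd (run against a cold cache;
# see the cache-flushing note further down)
time dd if=/data/blastdb/est_mouse.nsq of=/dev/null bs=1M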

So I can now get performance sufficient for my needs by tweaking the read-ahead, but I'm still curious about what is going on.

I have run vmstat; however, it does not seem to report any IO for the NFS-based BLAST. I assume the bi/bo columns relate only to local disk IO? For the server this data is taken from, uname -a gives:
2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
The RAID hardware in this case is a Dell SAS 6/iR, but I get similar results from the other system, which has a PERC 6/i.
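
On the vmstat/NFS point: assuming the bi/bo columns do only count local block devices, the NFS client-side counters can be watched instead; a rough sketch:

# Cumulative NFS client RPC/operation counts (read/write mix, retransmissions)
nfsstat -c

# Crude per-interval view while the BLAST run is in progress
watch -n 2 nfsstat -c

# Raw counters, if you would rather diff them yourself
cat /proc/net/rpc/nfs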

BLAST here is the 32-bit build (64-bit BLAST gives less of an advantage to NFS over the local disks at the 256-sector read-ahead).

Results are:

BLASTN with data on local RAID0 drives, read-ahead 1024, BLAST completes in 24 seconds
[root@cam-clu-01 testblast]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 24450688    644  36156    0    0     2     0    2    1  0  0 100  0  0
 0  0      0 24450700    644  36152    0    0     0     0 1017  150  0  0 100  0  0
 2  0      0 24424908    784  60212    0    0 12318     0 1099  304  3  0 96  1  0
 2  0      0 24364052    848 120192    0    0 30224    40 1139  416 11  0 87  1  0
 2  0      0 24304600    912 179536    0    0 29648     0 1138  410 11  0 88  1  0
 2  0      0 24246444    968 237600    0    0 29228     0 1137  411 11  0 88  1  0
 2  0      0 24189060   1020 295976    0    0 28630     0 1132  390 11  0 88  1  0
 1  2      0 24129816   1064 353904    0    0 29436     6 1134  390 11  0 87  1  0
 2  0      0 24073940   1104 410124    0    0 27912     0 1129  405 11  0 88  1  0
 2  0      0 24016624   1140 465988    0    0 28556     0 1131  407 11  0 88  1  0
 2  0      0 23959072   1184 524148    0    0 28754     4 1135  412 11  0 88  1  0
 2  0      0 23901584   1204 581564    0    0 28568     0 1132  401 11  0 88  1  0
 2  0      0 23843364   1216 639748    0    0 29036     0 1133  399 11  0 88  1  0
 0  1      0 23743492   1676 738508    0    0 49406     6 1197  490  2  0 93  5  0
 0  0      0 23701168   1820 782056    0    0 21732     0 1123  347  0  0 96  3  0
 0  0      0 23701300   1828 781900    0    0     0    24 1018  155  0  0 100  0  0
 0  0      0 23701424   1828 781908    0    0     0     0 1014  151  0  0 100  0  0
 0  0      0 23701416   1828 781908    0    0     0     0 1018  155  0  0 100  0  0

BLASTN with data on local RAID0 drives, read-ahead 256, BLAST completes in 44 seconds
[root@cam-clu-01 testblast]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 24449624    664  35808    0    0     2     0    2    1  0  0 100  0  0
 0  0      0 24449872    664  36028    0    0     0     0 1017  157  0  0 100  0  0
 0  0      0 24449872    664  36028    0    0     0     0 1019  161  0  0 100  0  0
 1  1      0 24421508    796  64032    0    0 14038    40 1254  792  5  0 91  4  0
 1  1      0 24391328    824  93744    0    0 15080     0 1257  852  6  0 92  2  0
 1  1      0 24359532    856 126048    0    0 16022     0 1268  891  5  0 90  4  0
 1  1      0 24329476    876 155972    0    0 14946     4 1254  851  6  0 91  4  0
 1  1      0 24299776    900 185672    0    0 14880     0 1251  840  6  0 90  4  0
 1  1      0 24270904    936 214556    0    0 14432    12 1245  825  6  0 91  3  0
 1  1      0 24240948    968 244624    0    0 15018     0 1257  854  6  0 91  3  0
 1  1      0 24211732    996 273732    0    0 14568     0 1247  833  6  0 91  3  0
 2  0      0 24182448   1020 302824    0    0 14590     0 1249  841  6  0 91  4  0
 1  1      0 24153228   1052 331860    0    0 14598     0 1247  832  6  0 91  4  0
 2  0      0 24122556   1076 362432    0    0 15298     0 1267  870  6  0 92  2  0
 1  1      0 24095168   1088 389920    0    0 13654   100 1254  774  5  0 91  4  0
 1  1      0 24066904   1100 417856    0    0 14070     0 1239  800  6  0 92  2  0
 0  2      0 24038824   1116 445800    0    0 13916     2 1235  809  6  0 90  4  0
 1  1      0 24008820   1128 475476    0    0 14938     0 1251  837  6  0 92  2  0
 1  1      0 23979948   1136 504272    0    0 14434     0 1244  819  6  0 91  3  0
 2  1      0 23950004   1160 534244    0    0 14934     0 1254  846  6  0 91  4  0
 0  2      0 23920188   1172 563884    0    0 14872     0 1248  839  5  0 92  2  0
 1  1      0 23891020   1180 592824    0    0 14484     0 1246  824  6  0 91  3  0
 1  1      0 23859216   1188 624540    0    0 15838     0 1267  887  6  0 91  4  0
 0  1      0 23829436   1316 654112    0    0 14706    10 1241  783  5  0 92  4  0
 0  1      0 23795088   1780 687504    0    0 17088     4 1290  686  0  0 94  6  0
 0  0      0 23791312   1780 692876    0    0  2556     0 1043  204  0  0 99  1  0
 0  0      0 23791324   1788 692876    0    0     0    26 1019  164  0  0 100  0  0
 0  0      0 23791436   1788 692876    0    0     0     0 1014  152  0  0 100  0  0

BLASTN with data from the Xserve via NFS, BLAST completes in 32 seconds
[root@cam-clu-01 testblast]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 24450716    620  36028    0    0     2     0    2    1  0  0 100  0  0
 0  0      0 24450728    620  36324    0    0     0     0 1015  154  0  0 100  0  0
 2  1      0 24419696    708  65760    0    0  1940     0 4349 4198  5  1 91  3  0
 2  0      0 24372632    716 112204    0    0     0    42 6924 7277  9  1 87  3  0
 3  0      0 24325280    716 159144    0    0     0     0 6935 7335  9  1 87  3  0
 1  1      0 24279788    716 204156    0    0     0     4 6740 7062  9  1 87  3  0
 3  0      0 24234508    716 249572    0    0     0     0 6766 7109  9  1 87  3  0
 2  0      0 24188752    716 294300    0    0     0     0 6722 7090  9  1 87  3  0
 2  0      0 24141920    732 339748    0    0    70     6 6850 7157  9  1 87  3  0
 2  0      0 24094916    732 386520    0    0     0     0 6952 7291  9  1 87  3  0
 1  1      0 24049868    732 431720    0    0     0     0 6753 7019  9  1 87  3  0
 2  0      0 24003308    740 478280    0    0     0     6 7003 7356  9  1 87  3  0
 1  1      0 23956744    740 524452    0    0     0     0 6937 7228  9  1 87  3  0
 2  0      0 23921464    740 559912    0    0     0     0 5552 5580  7  1 87  5  0
 0  2      0 23875432    740 605580    0    0     0     0 6850 7156  9  1 87  3  0
 1  0      0 23832364    740 648584    0    0    34    40 6531 6806  8  1 88  3  0
 0  1      0 23811748    740 669672    0    0     0     0 3548 2958  0  0 93  6  0
 0  1      0 23788460    748 690408    0    0    82    12 3732 3263  0  0 93  6  0
 0  0      0 23791388    748 691296    0    0    64     0 1025  162  0  0 100  0  0

Re-running with data already in RAM from previous run, BLAST completes in 11 seconds
[root@cam-clu-01 ~]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 23701684   1688 781800    0    0     2     0    2    1  0  0 100  0  0
 0  0      0 23701968   1688 781836    0    0     0     0 1018  157  0  0 100  0  0
 2  0      0 23700952   1688 781836    0    0     0     0 1016  166 11  0 89  0  0
 2  0      0 23701220   1696 781836    0    0     0    44 1017  153 12  0 87  0  0
 2  0      0 23701204   1696 781836    0    0     0     0 1015  146 12  0 87  0  0
 2  0      0 23701228   1696 781836    0    0     0     0 1018  149 12  0 87  0  0
 2  0      0 23701104   1696 781836    0    0     0     0 1015  147 12  0 88  0  0
 0  0      0 23701880   1696 781836    0    0     0     0 1017  159  6  0 94  0  0
 0  0      0 23702004   1696 781836    0    0     0    22 1019  153  0  0 100  0  0
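
If anyone wants to repeat the cold-cache runs without rebooting, the page cache can be dropped between runs on 2.6.16+ kernels:

# Flush dirty pages to disk, then drop the page cache, dentries and inodes,
# so the next run reads from disk/NFS rather than from RAM
sync
echo 3 > /proc/sys/vm/drop_caches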

Justin


-----Original Message-----
From: bioclusters-bounces at bioinformatics.org [mailto:bioclusters-bounces at bioinformatics.org] On Behalf Of Georgios Magklaras
Sent: 20 February 2010 20:13
To: HPC in Bioinformatics
Subject: Re: [Bioclusters] BLAST speed mystery

In general, I tend to use iozone (http://www.iozone.org/) to measure
IOPS before I put cluster nodes into production. I assume that the
BLAST versions in the G4 and new server (Linux?) environments are
the same.
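
An illustrative invocation (the file size and record length are only
examples; make the file larger than RAM if you want to avoid measuring
the cache):

# Sequential write then random read/write on a 1 GB file with 4K records;
# -O reports results in operations per second rather than KB/s
iozone -i 0 -i 2 -r 4k -s 1g -O -f /tmp/iozone.tmp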

Doing a vm_stat (on Mac OS X) and vmstat (on Linux) during the BLAST
run (both before caching and with est_mouse cached) can give you rough
figures for disk throughput and buffer cache usage (yes, having more
stripes is useful, but something else might be happening).

However, it would be useful to give us the software (OS/kernel
version) and hardware (RAID controller) details of your new servers.

GM




