There is nothing changed in this port except:- i) comment out the Altivec prefetch code ii) remove Apple version of Altivec'd BlastNtFindWords and restore the original 2001/12/20 version I have tested it on RedHat 9. You may have to fiddle with the make system on whatever platform you use to convince it not to mess around with MOTIF. It is still based on the 2001/12/20 version of NCBI blast. It should relatively trivial to move the changes into the latest version. Get it at: http://gadfly.gen.cam.ac.uk/~davidh/ncbi-ag-x86.tgz I no longer have access to the G4 I used for verification to check that the version I have left here gives identical outputs to the G4 version - it should. The results will not be identical to the NCBI 2001/12/20 version as the order in which hits are found is different in the Apple code. I would appreciate that someone checks that it still does. And that someone attempts to get performance numbers across a range of platforms. The reason for improved performance is simple: the NCBI version redundantly expands hits it has already logged earlier other than at a default wordsize of 11. NCBI Blastn uses an initial comparison to determine if it should attempt to extend the match to the required wordsize. A match here will cause it to attempt to look up and extend against all positions in its hash table to reach the target wordsize. Success on that triggers a final extension attempt to get as far as it can. The extension proceeds on either a byte-vs-byte manner and terminates with a base-vs-base manner. base-base extensions are made 3-symbols to left and right. The problem is at low wordlength (< 11), NCBI Blastn will use a 4-base initial comparison (1 byte). Evidently this will generate a humongous number of almost certainly unsuccessful extension attempts against the hash table positions. At wordlength>10, Blastn uses an 8-base initial comparison. For wordlength=11, this will result in that being followed by 3-base extensions to the left and right. Should the range achieved on summing the left and right extensions reach the amount required to get 11, it will then do the full extension attempt. After doing this for all matching entries in its hash table, it will advance the search position by 4 bases (1 byte). For wordlengths > 11, the initial match is still done with 8-bases but the subsequent extension would be performed by adding additional bytes (4-base) searches in the forward direction. The final 3-base extension to left and right is still performed. The advance is still by a single byte. If I have analysed their code correctly, the flaw lies in the advance not being changed appropriately with large wordsizes. It is easily demonstrated that with increasing wordsizes, the probability of redundantly exploring hits that already have been explored from the previous advance cycle increases sharply. I think this is why its performance gets trashed so badly by AG Blast in large wordsizes. The logic of optimal stepsize is simple although the need to quantise to byte/word/etc boundaries complicates it slightly. Nevertheless, the strategy is optimal at wordsize=11 (which makes me wonder which was chosen first, the implementation or the optimal wordsize ;-) ). The Apple code is capable of tuning its initial matchsize and stepsize to match the wordsize requirements. This has benefits at wordsizes lower and higher than 11. At lower wordsizes, it means you can avoid using the initial matchsize of 4. At higher wordsizes, you can now step thru at a larger stepsize. So this is why I think the AG code works better and on all platforms. I am disappointed that they didn't just say so. As Ian Korf correctly pointed out, wordsizes < 11 are useful. Regards, David Huen