[Bioclusters] mpiBLAST and NCBI Blast? BioseqFindFunc: couldn't uncache

Dan Bolser bioclusters@bioinformatics.org
Wed, 8 Sep 2004 12:56:11 +0100 (BST)


Hi, 

I am running what I think is regular NCBI BLAST from the BLAST toolbox. I
am using blastall with the '-a 4' option on a Xeon machine.

I see the 'bug' 

[blastall] ERROR: ncbiapi [000.000]  lcl|...: BioseqFindFunc: couldn't uncache

for which Google turns up the following page:
http://mpiblast.lanl.gov/README-1.2.0.html

It also turns up one page from someone reporting the bug on pdb-l (with no
replies):
http://vivaldi.bio.bnl.gov/asda/bb/archive/pdb-l/pdb-l.200301/1271.html

Significantly, he is running on a cluster.


Basically, I would like to know whether I am, or could be, running mpiBLAST
without knowing it, so that all these reports trace back to mpiBLAST, or
whether the mpiBLAST page above is describing a bug that traces back to
BLAST itself on multi-CPU architectures.

The version I get is "blastall 2.2.8".


Anybody know if this 'bug' is serious? Can I ignore it?

FYI here are some of the ~100 sequences which repeatedly throw up this
fault...

>lcl|6812 unnamed protein product
DKDVAKRLAEFAGIPVAPYRVLTRKAFRVSSLAKAVEGLSLPVFVKPCNMGSSVGIHKVKTQDALEAALDDAFRYDVKVL
VQQGIDAREIEVAVLEDETLFASLASELNPNAHHEFYSYEAKYLDPDGARVDLPARLDAAQMERVRSLATRVFAALECSG
FARVDFFLDRTGEFCFNEINTLPGFTSISMYPKMMEASGVPYGELLSRLVDLALDRHRQRQ
>lcl|2637 unnamed protein product
PLITPPHIKPEWYFLFAYAILRSIPNKLGGVLALIMSILILAILPLLQTTKQRSMVFRPFSQIMFWTLTADLFTLTWIGG
QPVEYAFVIIGQIASILYFSLILIIMPTVSLIENNMLKW
>lcl|6213 unnamed protein product
GRLRVGLLFGGRSREHEVSVVSAAAIAQAFNGDRYEVIPIYIEKDGRWRHTERVGIPEGTAPESSLWQFPAVVDTIDVWF
PIVHGPNGEDGTIQGLLELMQRPYVGSGVAASAIGM
>lcl|13053 unnamed protein product
IISFLNKLTTSNKTPKLVKGLINKLGLSYQENTDETISFAIHKGEIFAIAGVEGNGQSQLVNLICGIEKAASNKLIFNNI
DISRWSIRKRNAGISFVLEDRGLILQTVRFNTVNNQINNRSWNFLKPMEIALYSNTIIKKFDVRGSAEGSAVVRRLSGGN
QQKLIIGREMTKQNDLLVLAQVTRGLDIGAIAFIHENILLAKANKAILLVSYELDEIALADTVAVINKGRIVGMGKRD
>lcl|10188 unnamed protein product
MGAVLINENDEVMFFNPAAEKLWGYKREEVIGNNIDMLIPRDLRPAHPEYIRHNREGGKARVEGMSRELQLEKKDGSKIW
TRFALSKVSAEGKVYYLALVRDASVEMAQKEQTRQL
>lcl|5266 unnamed protein product
PLVTPQHIKPEWYFLFAYGILRSIPNKLGGTLALLMSVMILTTTPFTHTSRIRSMTFRPLTQTLFWLLIATFITITWTAT
KPVEPPFIFISQMASIIYFSFFIINPILGWAENKMQ
>lcl|2757 unnamed protein product
DRERFQAAVERLGLLQPQNATVTAMEQAVEKSREIGFPLVVRPSYVLGGRAMEIVYDEQDLRRYFNEAVSVSNESPVLLD
RFLDDATEVDIDAICDGERVVIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDKMREQVEKLAFELGVRGLMNTQFAVK
DNEVYLIEVNPRAARTVPFVSKATGAPLAKIAARVMAGQSLESQGFTKEIIPPYYSVKEVVLPFNKFPGVDPLLGPEMRS
TGEVMGVGATFAEAYAKAELGC
>lcl|13696 unnamed protein product
KLRSKLLWQGAGLPVAPWVALTRAEFEKGLSEEQKARISALGLPLIVKPSREGSSVGMTKVVEENALQGALSLAFQHDDE
ILIEKWLCGPEFTVAIVGEEILPSIRIQPAGTFYDYEAKYLSDETQYFCPAGLEASQEAALQSLVLQAWKALGCTGWGRI
DVMLDSDGQFYLLEANTSPGMTSHSLVPMAARQAGMSFSQLVVRILE

The 'unnamed' business is due to fastacmd not parsing my fasta header
format properly, and isn't unique to the sequences which throw up the
errors.
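
In case it helps anyone check their own runs, the affected sequences can be
tallied from blastall's stderr with something like the sketch below. It
assumes every error line has the same shape as the one quoted above (the
real ID after "lcl|" is elided there), and the function name
count_uncache_errors is made up for illustration, not part of any BLAST
tool.

```python
import re
from collections import Counter

# Assumed error-line shape, taken from the single example in this post:
#   [blastall] ERROR: ncbiapi [000.000]  lcl|<id>: BioseqFindFunc: couldn't uncache
ERROR_RE = re.compile(r"lcl\|([^:\s]+).*BioseqFindFunc: couldn't uncache")


def count_uncache_errors(lines):
    """Count "couldn't uncache" errors per local sequence ID.

    `lines` is any iterable of strings, e.g. an open stderr log file.
    """
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file (stderr redirected to a file during the run) gives
a Counter keyed by sequence ID, so most_common() lists the sequences that
fail most often first.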

Thanks in advance for any suggestions,

Dan.