[Bioclusters] Opteron Perl64 segfault issues
Nathan O. Siemers
bioclusters@bioinformatics.org
Thu, 21 Aug 2003 16:46:10 -0400
All:
Joe Landman from Scalable Informatics, Lawrence Hannon from IBM, and I
have been working on issues running BLAST on the AMD Opteron platform.
Below I've summarized my results (with much help from Joe and Lawrence)
in validating the blastall and formatdb code. There are quirks in the
latest versions of the NCBI toolkit that produce corrupt BLAST results
in some situations. They appear only with some (large) databases, and
we do not yet know exactly what causes them. We have tentative
workarounds, listed below.
Thanks to everyone who has helped me over the past few weeks. The
bottom line is that *none* of the problems I have seen in that time
could actually be traced to Opteron hardware (other than a RAM chip)
or to the Linux OS. This is great news for Opteron.
SUMMARY
Builds of formatdb and blastall from NCBI Toolkit version 2.2.6
can produce corrupted output when used with some formatdb parameters,
in all builds tested so far on the 64-bit AMD Opteron platform.
Symptoms include failure to produce a correctly named .nal or .pal
file when databases are split into volumes. Pointer errors produce
incorrect results and alignments with some large databases. NCBI
Toolkit 2.2.1 does not show this behavior. We have reproduced some of
these errors on SGI MIPS IRIX platforms with SGI compilers, suggesting
that the errors are neither Opteron- nor compiler-specific.
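The missing-alias-file symptom is easy to check for after a formatdb run. The following is only a minimal Python sketch: the helper name is hypothetical, and it assumes the alias file sits next to the volumes and is named after the database base name (.nal for nucleotide, .pal for protein databases).

```python
import os

def alias_file_present(db_path, protein=False):
    """Return True if the alias file for a multi-volume database exists.

    Hypothetical helper. Assumes the alias file is <db_path>.nal for a
    nucleotide database or <db_path>.pal for a protein database.
    """
    extension = "pal" if protein else "nal"
    return os.path.exists(f"{db_path}.{extension}")

if __name__ == "__main__":
    # "ncbi" is an illustrative database base name.
    print(alias_file_present("ncbi"))
```

This only confirms that a file with the expected name exists; it says nothing about whether the contents are correct.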
Current workarounds are to:
1. Explicitly name the formatdb output database with the -n option.
2. Use the '-o T' option in formatdb to alter the way BLAST indices
are created.
Alternatively:
3. Use the 2.2.1 version of the blastall tools.
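As a rough illustration of workarounds 1 and 2, a formatdb command line could be assembled as in the Python sketch below. It uses only the flags quoted in this post (-p F, -n, -i, -o T); the function name, its parameters, and the file names are illustrative assumptions, not part of the toolkit.

```python
def formatdb_command(fasta_file, db_name, use_o_option=True):
    """Build a formatdb argument list for a nucleotide database.

    Hypothetical helper: it only assembles the flags quoted in this post.
    """
    # Workaround 1: always name the output database explicitly with -n.
    cmd = ["formatdb", "-p", "F", "-n", db_name, "-i", fasta_file]
    # Workaround 2: add '-o T' to change how the indices are created.
    if use_o_option:
        cmd += ["-o", "T"]
    return cmd

if __name__ == "__main__":
    # "ncbi.fa" and "ncbi" are illustrative names.
    print(" ".join(formatdb_command("ncbi.fa", "ncbi")))
```

The list form can be handed to subprocess.run() directly, which avoids shell quoting problems with unusual file names.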
_______________________________________
TESTS
Machine, OS, libs:
2 CPU AMD Opteron (Penguin), 6G RAM, SUSE Linux 8, 2.4.19 SMP Linux
Kernel.
Current configuration:
opt:/gcgblast # gcc -v
Reading specs from /usr/lib64/gcc-lib/x86_64-suse-linux/3.2.2/specs
Configured with: ../configure --enable-threads=posix --prefix=/usr
--with-local-prefix=/usr/local --infodir=/usr/share/info
--mandir=/usr/share/man --libdir=/usr/lib64
--enable-languages=c,c++,f77,objc,java,ada --enable-libgcj
--with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib
--with-system-zlib --enable-shared --enable-__cxa_atexit x86_64-suse-linux
Thread model: posix
gcc version 3.2.2 (SuSE Linux)
(gcc-3.2.2-26.x86_64.rpm)
(glibc-2.2.5-184.x86_64.rpm)
ldd /usr/local/bin/blastall:
libm.so.6 => /lib64/libm.so.6 (0x0000002a9566d000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000002a957c6000)
libc.so.6 => /lib64/libc.so.6 (0x0000002a958e2000)
/lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x0000002a95556000)
_______________________________________
Databases:
ncbi: Human genome scaffold broken into 100KB pieces, 50KB overlap
(5.9G)
sncbi: same as above but long sequence names converted to shorter form
(some names were very long and I wanted to make sure this was not a
name-indexing problem)
htg: 20 August download of NCBI htg sequence file (11G uncompressed)
_______________________________________
Formatdb options:
o: using '-o T' option for indexing
no_o: no -o option
Other formatdb options used: '-p F -n <name> -i <fasta_file>'
_______________________________________
blastall options: '-p tblastn -v 3 -b 3 -a 2 -d <db> -i <input_file>'
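The six database/option combinations can be enumerated with a few lines of Python. This is a sketch under assumptions: the blastall flags are the ones quoted above, but the helper names are hypothetical, and naming each formatted database "<db>-<variant>" is a labeling convention chosen here, not necessarily how the databases were named on disk.

```python
from itertools import product

DATABASES = ["ncbi", "sncbi", "htg"]   # databases described above
FORMATDB_VARIANTS = ["o", "no_o"]      # with and without '-o T'

def blastall_command(db, input_file):
    """Build the blastall argument list with the options quoted above."""
    return ["blastall", "-p", "tblastn", "-v", "3", "-b", "3",
            "-a", "2", "-d", db, "-i", input_file]

def build_test_matrix(input_file):
    """Map each label such as 'ncbi-no_o' to its blastall command."""
    matrix = {}
    for db, variant in product(DATABASES, FORMATDB_VARIANTS):
        label = f"{db}-{variant}"  # assumed naming convention
        matrix[label] = blastall_command(label, input_file)
    return matrix

if __name__ == "__main__":
    # "fly_refseq.fa" is an illustrative input file name.
    for label, cmd in build_test_matrix("fly_refseq.fa").items():
        print(label, " ".join(cmd))
```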
_______________________________________
Input file: 12 protein sequences from fly refseq:
>BMSPROT:NP_478140
>BMSPROT:NP_523807
>BMSPROT:NP_609725
>BMSPROT:NP_524716
>BMSPROT:NP_524665
>BMSPROT:NP_524468
>BMSPROT:NP_523392
>BMSPROT:NP_572997
>BMSPROT:NP_524671
>BMSPROT:NP_608480
>BMSPROT:NP_524763
>BMSPROT:NP_524817
(I've checked; the 'BMSPROT:' prefix doesn't seem to affect the analysis.)
_______________________________________
R E S U L T S
____________________________________________________________________
NCBI Toolkit  ncbi-o  ncbi-no_o  sncbi-o  sncbi-no_o  htg-o  htg-no_o
2.2.1         pass    pass       pass     pass        pass   pass
2.2.6         pass    FAIL*      pass     FAIL*       pass   pass
____________________________________________________________________
* - FAIL symptoms include error messages such as
'[blastall] ERROR: ncbiapi [000.000] BMSPROT:NP_478140: ObjMgrChoice:
pointer [0] type [1] not found', missing sequence names for db hits in
the BLAST summary, and sporadic nonsense alignments.
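The error-message symptom can be detected automatically by scanning captured blastall output. The Python sketch below is an assumption-laden illustration: the function name is hypothetical, and the pattern is modeled only on the single message quoted above, so other wordings would be missed.

```python
import re

# Modeled on the one message quoted in this post; the bracketed numbers
# are matched loosely in case they vary.
OBJMGR_PATTERN = re.compile(
    r"ObjMgrChoice: pointer \[\d+\] type \[\d+\] not found"
)

def find_objmgr_errors(text):
    """Return the lines of captured blastall output that carry the error."""
    return [line for line in text.splitlines() if OBJMGR_PATTERN.search(line)]

if __name__ == "__main__":
    sample = ("[blastall] ERROR: ncbiapi [000.000] BMSPROT:NP_478140: "
              "ObjMgrChoice: pointer [0] type [1] not found")
    print(len(find_objmgr_errors(sample)))
```

This catches only the error message. Missing hit names and nonsense alignments still need a comparison against known-good output, for example from a 2.2.1 build.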
CONFIGURATION
IBM, Siemers Opteron linux.ncbi.mk directives for 2.2.6 (April 2003),
SUSE 8.1 Opteron Linux:
NCBI_DEFAULT_LCL = lnx
NCBI_MAKE_SHELL = /bin/sh
NCBI_CC = gcc -pipe -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -O3 \
-DOS_UNIX_PPCLINUX -I../include -I/usr/X11R6/include -L/usr/X11R6/lib64 \
-DWIN_MOTIF
# should probably be /usr/X11R6/lib64 above on SUSE 8.1
NCBI_CFLAGS1 = -c
NCBI_LDFLAGS1 =
NCBI_OPTFLAG =
Opteron linux.ncbi.mk directives for 2.2.1 NCBI Toolkit:
NCBI_DEFAULT_LCL = lnx
NCBI_MAKE_SHELL = /bin/sh
NCBI_CC = gcc -pipe -D__USE_FILE_OFFSET64 -D__USE_LARGEFILE64
NCBI_CFLAGS1 = -c -DOS_UNIX_PPCLINUX
NCBI_LDFLAGS1 = -O2
NCBI_OPTFLAG = -O2