This page last modified: Aug 06 2009
keywords:blast,BLAST,ncbi,memory,allocate,error,query,database,library,refseq,nr,nt,fasta,formatdb,volume,blastall,tblastx
description:Cause and workaround for NCBI BLAST "Failed to allocate" error.
title:BLAST "Failed to allocate" error

I have not seen the BLAST segmentation fault or memory allocation bug
with 64-bit executables (running, of course, on a 64-bit operating
system).

With very large libraries such as refseq genomic, loading the full
search library takes more than 2GB of RAM (roughly the per-process
address space limit on many 32-bit systems). The message appears to
report the amount of additional RAM that could not be allocated,
rather than the total amount currently used or currently requested.

A partially effective workaround is to search only one "volume" of a
database (search library) at a time. For instance, use
refseq_genomic.00 or refseq_genomic.01. The workaround helps with
blastn, but with tblastx it helps in only about half the cases:
tblastx seems to hit a segmentation fault about half the time when
searching against refseq_genomic, even when searching only one volume
of the NCBI BLAST library. Not surprisingly, the NCBI volumes are
around 1GB each (compressed).

Subsequent testing showed that using a single, smaller library volume,
roughly 400MB, did not cause the crash. I downloaded all of refseq
genomic (aka "complete"), and built my own Blast search library with
commands such as:

cat *.fna > complete.fasta
/bioinfo/blast/bin/formatdb -i complete.fasta -p F -o T -n refseq_genomic -v 1600

(On my hardware, I think the cat command took 30 minutes. The
resulting file "complete.fasta" was around 105GB.)
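
The -v 1600 value lines up with the roughly 400MB volume size
mentioned above. The sketch below shows the arithmetic. It rests on
two assumptions that are mine, not stated in the notes above: that
formatdb's -v option counts millions of letters (bases) per volume,
and that a nucleotide volume packs about 4 bases into each byte. The
function names are made up for illustration, and the volume count is
only a rough upper estimate because it treats every byte of the FASTA
file as a base.

```shell
#!/bin/sh
# Assumption: formatdb's -v value counts millions of letters (bases) per volume.
# Assumption: nucleotide volumes pack about 4 bases into each byte.

# volume_mb V: approximate size in MB of one nucleotide volume built with -v V.
volume_mb() {
    echo $(( $1 / 4 ))
}

# volume_count TOTAL_MILLIONS V: number of volumes needed to hold
# TOTAL_MILLIONS million bases, rounded up.
volume_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

volume_mb 1600             # prints 400
volume_count 105000 1600   # prints 66 (105GB FASTA taken as ~105,000 million bases)
```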

In all my test cases, the query sequence is short.

ulimit is set to unlimited. The problem occurs only on 32-bit systems.

With further testing, we have found that the 64-bit NCBI BLAST
executables running on a 64-bit operating system do not have this bug.
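
Since the bug depends on whether the executable is 32-bit or 64-bit,
it can help to check what is actually installed. The following is a
minimal sketch, assuming the standard file(1) utility is available.
The function names are hypothetical. It relies on the description
text containing "32-bit" or "64-bit", which is typical for Linux ELF
binaries but not guaranteed on every platform (some versions of file
describe 32-bit OSX executables without the word "32-bit"), so an
"unknown" result needs a manual look.

```shell
#!/bin/sh
# classify_bits DESCRIPTION: print 32, 64, or unknown, based on a
# description string such as the one printed by file(1).
classify_bits() {
    case "$1" in
        *64-bit*) echo 64 ;;
        *32-bit*) echo 32 ;;
        *)        echo unknown ;;
    esac
}

# exe_bits PATH: classify the executable at PATH using "file -b"
# (-b prints the description without the leading file name).
exe_bits() {
    classify_bits "$(file -b "$1" 2>/dev/null)"
}
```

For example, "exe_bits /bioinfo/blast-2.2.18/bin/blastall" reports on
the executable itself, and "uname -m" reports the machine hardware
name for comparison.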


Instead of running one command with -d set to the whole library, such as:

/my_libs/refseq_genomic_20090630/refseq_genomic

run several searches where -d is:

/my_libs/refseq_genomic_20090630/refseq_genomic.00
/my_libs/refseq_genomic_20090630/refseq_genomic.01
/my_libs/refseq_genomic_20090630/refseq_genomic.02

A workaround would be to set up the Perl script you use to run BLAST
so that it iterates over the set of database volumes. (NCBI is not
clear that .00, .01, .02, etc. are "volumes", but the word "volume" is
used in the help text for formatdb, so I'll go with that
nomenclature.)
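
The iteration over volumes can be sketched as follows. This is a
minimal illustration in shell rather than Perl, not the script
actually used here. It assumes each nucleotide volume has an index
file named like refseq_genomic.00.nin next to the other formatdb
output. The function names, the BLASTALL override variable, and the
output file naming are all hypothetical, and only a few of the
blastall options from the full command line are included.

```shell
#!/bin/sh
# list_volumes DB_PREFIX: print one volume name per line, for example
# /my_libs/refseq_genomic_20090630/refseq_genomic.00
# Assumption: every volume has an index file named DB_PREFIX.NN.nin.
list_volumes() {
    for idx in "$1".[0-9][0-9].nin; do
        [ -e "$idx" ] || continue        # the glob matched nothing
        printf '%s\n' "${idx%.nin}"      # strip the .nin suffix
    done
}

# search_volumes DB_PREFIX QUERY: run one blastn search per volume.
# Set BLASTALL to use a different executable (hypothetical variable).
# Each result goes to QUERY.NN.xml, where NN is the volume number.
search_volumes() {
    db=$1
    query=$2
    list_volumes "$db" | while IFS= read -r vol; do
        "${BLASTALL:-/bioinfo/blast-2.2.18/bin/blastall}" \
            -p blastn -i "$query" -d "$vol" \
            -m 7 -e 1e-10 -o "$query.${vol##*.}.xml"
    done
}
```

A call might look like "search_volumes
/my_libs/refseq_genomic_20090630/refseq_genomic blast_input.fasta".
The per-volume result files then have to be merged or parsed
separately by whatever reads the BLAST output.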


Below are typical command lines and error messages. zeus is 32-bit
Linux. hera is 64-bit Apple OSX running the 32-bit NCBI executable.
(As of this writing there is no 64-bit OSX executable from NCBI.)


[zeus ~]$ /bioinfo/blast-2.2.18/bin/blastall -i blast_input.fasta -p blastn -F "T" -m 7 -T T -e 1e-10 -v 50 -b 50 -M BLOSUM62 -A 0 -a 1  -d "/search_libs/refseq_genomic_20090630/refseq_genomic"
[blastall] FATAL ERROR: Failed to allocate 435153695 bytes



[hera ~]$ /bioinfo/blast-2.2.18/bin/blastall -i blast_input.fasta -p blastn -F "T" -m 7 -T T -e 1e-10 -v 50 -b 50 -M BLOSUM62 -A 0 -a 1  -d "/search_libs/refseq_genomic_20090630/refseq_genomic"
blastall(55755) malloc: *** mmap(size=435154944) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
[NULL_Caption] FATAL ERROR: Failed to allocate 435153695 bytes
[hera ~]$