This page last modified: Aug 06 2009
BLAST "Failed to allocate" error

Cause and workaround for the NCBI BLAST "Failed to allocate" error.

I have not seen the BLAST segmentation fault or memory allocation bug with 64-bit executables (running, of course, on a 64-bit operating system).

With very large libraries such as refseq genomic, loading the full search library takes more than 2GB of RAM (roughly the address-space limit for a 32-bit process). The number in the message appears to be the amount of additional RAM that could not be allocated, rather than the total amount currently in use or currently requested.

A partially effective workaround is to search only one "volume" of the database (search library) at a time, for instance refseq_genomic.00 or refseq_genomic.01. The workaround helps with blastn, but only about half the time with tblastx: tblastx segfaults in roughly half of the runs against refseq_genomic, even when searching only one volume of the NCBI BLAST library. Not surprisingly, the NCBI volumes are around 1GB each (compressed). Subsequent testing showed that a single, smaller library volume of roughly 400MB did not cause the crash.

I downloaded all of refseq genomic (aka "complete") and built my own BLAST search library with commands such as:

    cat *.fna > complete.fasta
    /bioinfo/blast/bin/formatdb -i complete.fasta -p F -o T -n refseq_genomic -v 1600

(On my hardware, I think the cat command took 30 minutes. The resulting file "complete.fasta" was around 105GB.)

In all my test cases, the query sequence is short and ulimit is unlimited. The problem occurs only on 32-bit systems. With further testing, we have found that the 64-bit NCBI BLAST executables running on a 64-bit operating system do not have this bug.
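Since the bug appears to depend on whether the executable is 32-bit or 64-bit, it can help to confirm what is actually installed. The following is a minimal sketch, not a tested diagnostic: the BLASTALL default path and the classify_bits helper are illustrative names, and the helper assumes the file(1) description contains "32-bit" or "64-bit", which holds for ELF binaries but may not hold for every format.

```shell
#!/bin/sh
# Sketch: report the word size of the userland and of a BLAST binary.
# The default path is only an example; set BLASTALL to the real location.
BLASTALL="${BLASTALL:-/bioinfo/blast/bin/blastall}"

# Map a file(1) description to 32, 64, or unknown.
# Assumption: the description contains "32-bit" or "64-bit" (true for ELF);
# other binary formats may be worded differently and will report "unknown".
classify_bits() {
    case "$1" in
        *64-bit*) echo 64 ;;
        *32-bit*) echo 32 ;;
        *)        echo unknown ;;
    esac
}

# getconf LONG_BIT prints the size of a C long in the default environment.
echo "userland word size: $(getconf LONG_BIT 2>/dev/null || echo unknown)"

if [ -e "$BLASTALL" ]; then
    # file -b prints the description without the leading file name.
    echo "blastall word size: $(classify_bits "$(file -b "$BLASTALL")")"
else
    echo "blastall not found at $BLASTALL"
fi
```

A 32-bit result for blastall on a 64-bit operating system would match the hera case shown in the error output on this page.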
Instead of using one command with -d such as:

    /my_libs/refseq_genomic_20090630/refseq_genomic

run several searches where -d is:

    /my_libs/refseq_genomic_20090630/refseq_genomic.00
    /my_libs/refseq_genomic_20090630/refseq_genomic.01
    /my_libs/refseq_genomic_20090630/refseq_genomic.02

A workaround would be to set up the Perl script you use to run BLAST to iterate over the set of database volumes. (NCBI is not clear that .00, .01, .02, etc. are "volumes", but the word "volume" is used in the help text for formatdb, so I'll go with that nomenclature.)

Below are typical command lines and error messages. zeus is 32-bit Linux. hera is 64-bit Apple OS X running the 32-bit NCBI executable. (As of this writing there is no 64-bit OS X executable from NCBI.)

    [zeus ~]$ /bioinfo/blast-2.2.18/bin/blastall -i blast_input.fasta -p blastn -F "T" -m 7 -T T -e 1e-10 -v 50 -b 50 -M BLOSUM62 -A 0 -a 1 -d "/search_libs/refseq_genomic_20090630/refseq_genomic"
    [blastall] FATAL ERROR: Failed to allocate 435153695 bytes

    [hera ~]$ /bioinfo/blast-2.2.18/bin/blastall -i blast_input.fasta -p blastn -F "T" -m 7 -T T -e 1e-10 -v 50 -b 50 -M BLOSUM62 -A 0 -a 1 -d "/search_libs/refseq_genomic_20090630/refseq_genomic"
    blastall(55755) malloc: *** mmap(size=435154944) failed (error code=12)
    *** error: can't allocate region
    *** set a breakpoint in malloc_error_break to debug
    [NULL_Caption] FATAL ERROR: Failed to allocate 435153695 bytes
    [hera ~]$
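The per-volume workaround (one search per volume rather than one search against the whole library) could be scripted roughly as follows. This is a sketch in shell rather than Perl, under several assumptions: the BLASTALL, DB, QUERY and DRY_RUN variables and the list_volumes and search_volume helpers are illustrative names, volumes are assumed to carry two-digit suffixes, and the option list is trimmed to a few of the flags from the command lines above plus -o for the output file. With DRY_RUN=1 (the default here) the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: run one blastall search per database volume instead of one search
# against the whole library. Paths and the DRY_RUN switch are illustrative.
BLASTALL="${BLASTALL:-/bioinfo/blast-2.2.18/bin/blastall}"
DB="${DB:-/my_libs/refseq_genomic_20090630/refseq_genomic}"
QUERY="${QUERY:-blast_input.fasta}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

# List volume names (refseq_genomic.00, .01, ...) by looking for the .nsq
# file that formatdb writes for each nucleotide volume.
# Assumption: two-digit volume suffixes.
list_volumes() {
    for f in "$1".[0-9][0-9].nsq; do
        [ -e "$f" ] || continue      # the glob matched nothing
        echo "${f%.nsq}"
    done
}

# Search a single volume, writing results to <query>.<volume suffix>.xml
search_volume() {
    vol="$1"
    out="$QUERY.${vol##*.}.xml"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$BLASTALL -i $QUERY -p blastn -m 7 -e 1e-10 -d $vol -o $out"
    else
        "$BLASTALL" -i "$QUERY" -p blastn -m 7 -e 1e-10 -d "$vol" -o "$out"
    fi
}

list_volumes "$DB" | while read -r vol; do
    search_volume "$vol" || echo "search failed for $vol" >&2
done
```

One caveat with this approach: E-values depend on the size of the database searched, so hits from separate volumes are not directly comparable with a whole-library search. Setting blastall's -z option (effective database length) to the same value for every volume may compensate, though I have not verified this here.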