Searches fail with "bad allocation" error

redc
Builder

I have a number of saved searches set up which are running on nightly cron jobs. I'm attempting to stagger the schedules enough so that not too many saved searches are running at the same time. However, some searches periodically fail with a "bad allocation" error. There doesn't seem to be any other information about what this error means.

This typically happens on somewhat larger searches. For example, I have a dashboard that displays the "Last 7 days", "Last 30 days", and "Last 3 months" of a data set. Each view contains between 150,000 and 1 million events; I've set up a scheduled search to run once a day for each of the three views for performance reasons.

The "Last 7 days" search (~150,000 events) runs just fine, averaging about 45 seconds to complete. The "Last 30 days" (500,000 events) occasionally encounters the "bad allocation" error and averages 2 minutes to complete. The "Last 3 months" (1 million events) encounters this error fairly frequently after about 2 minutes.

I can often run the same search ad-hoc and it will complete just fine.

It doesn't seem likely that this is a timeout issue because I have other saved searches that take upwards of 5 minutes to run and they consistently complete just fine.

Does anyone know what the "bad allocation" error indicates? Does it mean my server doesn't have enough memory/CPU to process the data? Does it indicate a problem with the hardware itself (i.e., a bad disk or memory allocation)?

1 Solution

redc
Builder

I finally managed to catch this in the act. My search head was running out of memory. Boosted its memory capacity, no more "bad allocation" errors. Until I overreach its current capacity, anyway!

GSK
Explorer

Hello All

I am having the same "bad allocation" issue. We recently increased our RAM from 8 GB to 24 GB and are still having the problem. We are running on a VM; does that have any impact? My limits.conf settings are below. Please shed some light on this issue.

System configuration:
8 cores, 2.67 GHz
24 GB RAM

limits.conf:

[search]
dispatch_dir_warning_size = 5000
max_searches_per_cpu=3
max_mem_usage_mb = 400

fetch_remote_search_log = false
ttl = 3600
replication_period_sec  = 2400
replication_file_ttl = 1600
sync_bundle_replication = 0
multi_threaded_setup = true

[metadata]
maxcount = 400000

[search_metrics]
max_rawsize_perchunk = 10000000

[lookup]
max_memtable_bytes = 30000000

redc
Builder

Running on a VM affects your searches because VMs are noticeably slower than physical servers. I never tested a Splunk server on a VM vs. a physical machine, but with another software product that has hardware requirements similar to Splunk's, we found that running on a physical server tripled processing speed over a virtual one with the same specs. If you have lots of scheduled searches that run sequentially or close together, they may take enough longer on a VM to end up stacking on top of each other (running simultaneously), which can run the server out of memory. I spent two days adjusting our search schedules (some 1,500 scheduled searches) to prevent them from stacking on each other too much.
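
To make the "stacking" idea concrete, here is a minimal Python sketch that estimates how many scheduled searches would be running at the same moment. It is only an illustration: the `ScheduledSearch` class, the search names, and the start times are hypothetical, the runtimes are rounded from the figures earlier in this thread, and the sketch ignores runs that cross midnight.

```python
from dataclasses import dataclass


@dataclass
class ScheduledSearch:
    name: str             # hypothetical saved-search name
    start_minute: int     # minutes after midnight when the cron job fires
    runtime_minutes: int  # observed runtime, rounded up to whole minutes


def peak_concurrency(searches):
    """Return (peak, minute): the largest number of searches running at once
    and the first minute at which that peak is reached."""
    events = []
    for s in searches:
        events.append((s.start_minute, 1))                        # search starts
        events.append((s.start_minute + s.runtime_minutes, -1))   # search ends
    # Sort by time; at equal times, ends (-1) are processed before starts (+1),
    # so back-to-back searches are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0
    peak_minute = None
    for minute, delta in events:
        running += delta
        if running > peak:
            peak, peak_minute = running, minute
    return peak, peak_minute


# Hypothetical schedule: all three dashboard searches fire around 02:00.
searches = [
    ScheduledSearch("last_7_days", 120, 1),
    ScheduledSearch("last_30_days", 120, 2),
    ScheduledSearch("last_3_months", 121, 3),
]
print(peak_concurrency(searches))  # (2, 120)
```

If the reported peak is higher than what the server's memory can comfortably handle, spreading the start minutes further apart is the first thing to try.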

Are you running on a Windows server or a Unix system? Splunk running on Windows will be slower than on Unix, and Windows servers may leak memory as well. Those are two of the reasons we ditched our Windows installation and went to a Unix installation. We experienced a 50% performance improvement and stopped having the memory leak problem.

Finally, do you run any inputlookup commands that load CSV files? The CSV file has to be fully loaded into memory before the search can proceed, so even if you have max_mem_usage_mb set low, the search can still consume more memory by loading the CSV.
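
One quick way to see whether lookups could be a factor is to list the largest CSV files in a lookups directory. The sketch below is a plain file-size check written for illustration, not anything Splunk ships; the function name and the directory path in the usage note are hypothetical, and the threshold is simply a cut-off you choose.

```python
import os


def large_lookups(lookup_dir, threshold_bytes):
    """Return [(filename, size_bytes)] for CSV files in lookup_dir that are
    larger than threshold_bytes, biggest first."""
    found = []
    for entry in os.scandir(lookup_dir):
        if entry.is_file() and entry.name.lower().endswith(".csv"):
            size = entry.stat().st_size
            if size > threshold_bytes:
                found.append((entry.name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

For example, `large_lookups("/path/to/your/app/lookups", 30_000_000)` would report every CSV over roughly 30 MB, which happens to be the max_memtable_bytes value posted above and is used here only as a convenient cut-off.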

scottyp
Observer

I have a standalone Splunk instance containing only data imported from botsv3, and I followed these instructions for adjusting Splunk's memory settings: you can allocate more memory to Splunk by adjusting the settings in the limits.conf file. Locate this file in the Splunk installation directory and modify the max_mem setting to allocate more memory. The file typically resides at $SPLUNK_HOME/etc/system/local/limits.conf.
I changed
max_mem = <new value>MB

So far, after changing max_mem from the original 200 MB to 6,144 MB (6 GB) for Splunk to use, I no longer seem to have the bad allocation issue. I will continue monitoring for the error and update my comment if I run into it again. This solution may not work for everyone's specific situation: you may join an organization where the memory allocation has already been configured, and you may not have permission to change any configuration. But if you are working in a home lab and making your own configurations as the Splunk admin, this is a good place to start.

Since none of the solutions actually provide steps for making the adjustments, I figured I would include some descriptive steps in this discussion for people who are learning, so that others can contribute their expertise. Please build on the discussion with actionable steps, rather than just replying that this solution may not work, so people can actually learn what the solution steps are.

redc
Builder

I agree with martin.

To explain it in a nutshell, the search results (the events) have to be stored in memory in order for the additional commands to be run against them (e.g., stats, eval, chart, table, etc.). I'd guess that if your subsearch is returning 5,000,000 records, then your primary search is probably returning at least that many as well.
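
As a rough back-of-the-envelope illustration of why that many results hurts, the Python sketch below multiplies a row count by an average row size. The 500 bytes per row is an assumed, illustrative figure, not a measurement, and real memory use also depends on field extraction and the commands in the pipeline.

```python
def estimated_result_memory_gb(rows, avg_bytes_per_row):
    """Rough lower bound, in GiB, on the memory needed to hold `rows` results at once."""
    return rows * avg_bytes_per_row / 1024**3


# 5,000,000 rows matches the maxout value discussed in this thread;
# 500 bytes per row is an ASSUMED average used purely for illustration.
print(round(estimated_result_memory_gb(5_000_000, 500), 2))  # 2.33
```

Even under that modest assumption, a single subsearch would want a couple of gigabytes before the outer search has done any work, and several such searches running together would multiply that.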

If you could update your own "bad allocation" thread (http://answers.splunk.com/answers/130279/searches-are-failing-with-bad-allocation-error) with your search, I'd be happy to take a look at it and see what kind of refactoring can be done.

martin_mueller
SplunkTrust

Storing up to five million events for each subsearch - I'm not surprised your server is on its knees. Usually it's possible to refactor your searches so they don't require that many subsearch results. That will help not only your memory but also your overall performance.

ncbshiva
Communicator

Hi, thanks for your answer, redc.

My Splunk server is quad-core with 16 GB of RAM.
I get this "bad allocation" error after changing the limits.conf file.

I am changing the parameters below:

[searchresults]
maxresultrows = 5000000

[subsearch]
maxout = 5000000

I was getting a message saying that the subsearch had been truncated to 50000 results, so I increased these parameters in limits.conf. Since increasing them, I have been getting the "bad allocation" error.

Could you please help me with this?

redc
Builder

Correct. In my case, I increased our server from 8GB to 16GB of memory.

Realistically 8GB really isn't sufficient for more than the most basic of searches by a single user, but we were just starting to ramp up our usage and I'd forgotten how small we'd started initially. I'm sure we'll have to increase it again.

martin_mueller
SplunkTrust

In other words, you added more physical/virtual memory to the machine rather than changing some configuration within Splunk?

redc
Builder

This will be your Splunk server. Depending on your setup, this may be a single server (one that runs both the indexer and the search head, which is how ours is configured) or multiple servers (if you run your search head(s) separately from your indexer(s)). Increasing memory on your Splunk server(s) requires rebooting the server(s).

You may also want to increase your CPUs; a typical setup ratio is 2 GB of memory for every CPU on the server (so a 4-CPU server would have 8 GB of memory).

Your sysadmin (or whoever created the server for you to use) should be able to take care of this for you.

ncbshiva
Communicator

Hi,
I am facing the same problem too. Could you let me know where we need to increase the search head memory?

Thanks in advance:-)

redc
Builder

Final line in the search.log:

03-25-2014 10:34:17.983 ERROR dispatchRunner - RunDispatch::runDispatchThread threw error: bad allocation

redc
Builder

Yeah...not very enlightening:

03-25-2014 10:34:13.739 INFO  UserManager - Unwound user context: admin -> NULL
03-25-2014 10:34:13.739 ERROR DispatchThread - bad allocation
03-25-2014 10:34:13.739 INFO  UserManager - Setting user context: admin
03-25-2014 10:34:13.739 INFO  UserManager - Done setting user context: NULL -> admin
03-25-2014 10:34:13.739 INFO  UserManager - Unwound user context: admin -> NULL
03-25-2014 10:34:13.739 INFO  DispatchManager - DispatchManager::dispatchHasFinished(id='1395761401.565', username='admin')
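
If you need to check many search.log files for this error, a small script can pull out just the relevant lines. The Python sketch below assumes the log line layout shown in this thread (timestamp, level, component, then " - " and the message); the function name is made up for illustration, and other Splunk versions may format lines differently.

```python
import re

# Layout assumed from the excerpts in this thread:
# "MM-DD-YYYY HH:MM:SS.mmm LEVEL  Component - message"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<component>\S+) - (?P<message>.*)$"
)


def find_bad_allocations(lines):
    """Return (timestamp, component) for each ERROR line mentioning 'bad allocation'."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match["level"] == "ERROR" and "bad allocation" in match["message"]:
            hits.append((match["timestamp"], match["component"]))
    return hits


# Sample lines copied from the log excerpts posted in this thread.
sample = [
    "03-25-2014 10:34:13.739 INFO  UserManager - Unwound user context: admin -> NULL",
    "03-25-2014 10:34:13.739 ERROR DispatchThread - bad allocation",
    "03-25-2014 10:34:17.983 ERROR dispatchRunner - RunDispatch::runDispatchThread threw error: bad allocation",
]
print(find_bad_allocations(sample))
```

The timestamps it returns can then be lined up against the server's memory usage at those moments, which is how the out-of-memory condition was eventually caught here.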

martin_mueller
SplunkTrust

Have you looked through the relevant search.log files?
