How to investigate crashing indexer

shivanandbm
New Member

Our indexers are in a cluster. We have 4 indexers and they crash about once a week. I do not know how to start investigating.
I have tried several approaches but have not been able to identify the cause.

1) Whenever they crash, CPU load, memory, and swap are exhausted. This happens only around the time of a crash; most of the time the indexers are lightly utilized.

free -g
             total       used       free     shared    buffers     cached
Mem:            11         10          0          0          0          9
-/+ buffers/cache:          0         10
Swap:            3          0          3
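
As a reading aid for the output above, here is a minimal Python sketch that works out how much memory is held by processes rather than by the page cache. It assumes the older procps layout of free (the one with a "-/+ buffers/cache" line); the function names parse_free and used_by_processes are hypothetical, chosen only for illustration.

```python
# Parse output in the older procps `free` layout (with a "-/+ buffers/cache" line).
def parse_free(text):
    """Return the Mem and Swap rows as dicts keyed by column name."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header = lines[0]  # total used free shared buffers cached
    rows = {}
    for parts in lines[1:]:
        if parts[0] in ("Mem:", "Swap:"):
            values = [int(v) for v in parts[1:]]
            # The Swap row has only three values, so zip stops at "free".
            rows[parts[0].rstrip(":")] = dict(zip(header, values))
    return rows


def used_by_processes(mem):
    """Memory in use once reclaimable buffers and page cache are excluded."""
    return mem["used"] - mem["buffers"] - mem["cached"]


# Values copied from the post, in GiB as reported by `free -g`.
SAMPLE = """\
total used free shared buffers cached
Mem: 11 10 0 0 0 9
-/+ buffers/cache: 0 10
Swap: 3 0 3
"""

rows = parse_free(SAMPLE)
print("used by processes (GiB):", used_by_processes(rows["Mem"]))  # 10 - 0 - 9
print("swap used (GiB):", rows["Swap"]["used"])
```

Read this way, the sample shows most of the 10 GB "used" sitting in the page cache and swap untouched, which matches the low utilization outside crash times. The small difference from the 0 that free prints on the "-/+ buffers/cache" line is most likely whole-gigabyte rounding from the -g option.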

2) Also, we see one accelerated search that completes during this time.

Let me know if there is any other information I should provide.

Please help us with this.


mayurr98
Super Champion

Here are some helpful searches for investigating a crash:

1) When did Splunk last crash?

index=_internal sourcetype=splunkd_crash_log | stats count by host

2) Show me all Splunk restarts, based on loader messages

index=_internal sourcetype=splunkd loader message=*xml

3) Which users are running lengthy searches?

index="_audit" action="search" (id=* OR search_id=*)
| eval user=if(user=="n/a",null(),user)
| stats max(total_run_time) as total_run_time first(user) as user by search_id
| stats count perc95(total_run_time) median(total_run_time) by user

You may also want to open Monitoring Console >> Resource Usage and check memory and disk space usage for the time period around the crash.
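
If the Monitoring Console charts turn out to be too coarse to catch a short spike, one option is to sample memory and swap directly on the indexer host. The following is only a minimal sketch, assuming a Linux host that exposes /proc/meminfo; the function names parse_meminfo, summarize, and sample_forever are hypothetical and not part of Splunk.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])  # most values are in kB
    return info


def summarize(info):
    """Return (memory used excluding buffers/cache, swap used), both in kB."""
    mem_used = (info["MemTotal"] - info["MemFree"]
                - info.get("Buffers", 0) - info.get("Cached", 0))
    swap_used = info["SwapTotal"] - info["SwapFree"]
    return mem_used, swap_used


def sample_forever(path="/proc/meminfo", interval=10):
    """Print one timestamped line every `interval` seconds until interrupted."""
    while True:
        with open(path) as fh:
            mem_used, swap_used = summarize(parse_meminfo(fh.read()))
        print(time.strftime("%Y-%m-%d %H:%M:%S"), mem_used, swap_used, flush=True)
        time.sleep(interval)
```

Calling sample_forever() and redirecting the output to a file on a disk that survives the crash leaves a timeline showing when memory and swap start to climb, which can then be lined up against the search activity in _audit.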


shivanandbm
New Member

Thanks a lot for replying. Neither the crash search nor the loader-based restart search returns any results.
The Splunk system user is the top user during that time.
Also, memory and swap were completely exhausted, and then the Splunk process on the indexer stopped. I restarted it manually.

I just want to know why memory and swap are exhausted for a short period of time. I also see high load during that time.

I also saw that the forwarders were disconnected during that time, as I got forwarder-missing alerts for many of them.

I am not seeing anything in the crash log.

Our Splunk process stopped and I restarted it manually.

Regards, Shivanand
