topic How to investigate crashing indexer in Getting Data In

How to investigate crashing indexer

shivanandbm — Fri, 02 Aug 2019 19:14:32 GMT

Our indexers are in a cluster. We have 4 indexers, and they crash about once a week. I do not know how to start investigating.
I have tried several approaches but could not identify the cause.

1) Whenever they crash, CPU load is high and memory and swap are exhausted. This happens only around the time of the crash; the rest of the time the indexers are lightly utilized. Output of free -g during normal operation:

free -g
             total       used       free     shared    buffers     cached
Mem:            11         10          0          0          0          9
-/+ buffers/cache:          0         10
Swap:            3          0          3

2) Also, we see one accelerated search completing around this time.

Let me know if there is any other information I should provide.

Please help us with this.

Re: How to investigate crashing indexer

mayurr98 — Fri, 02 Aug 2019 20:41:59 GMT

Here are some searches that help when investigating a crash:

1) When did Splunk last crash?

index=_internal sourcetype=splunkd_crash_log | stats count by host

2) When did Splunk restart, based on the loader messages?

index=_internal sourcetype=splunkd loader message=*xml

3) Which users run lengthy searches?

index="_audit" action="search" (id=* OR search_id=*)
| eval user=if(user=="n/a",null(),user)
| stats max(total_run_time) as total_run_time first(user) as user by search_id
| stats count perc95(total_run_time) median(total_run_time) by user

You might also need to look at Monitoring Console >> Resource Usage to check memory and disk space usage during the crash time period.
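
If the Monitoring Console is not handy, the same resource data can usually be searched directly. The following is only a sketch: it assumes the indexers write resource_usage.log into the _introspection index (the default, as far as I know), and the data.* field names are as I remember them, so verify them against your version. <your_indexer> is a placeholder for the host name of one of your indexers.

```
index=_introspection sourcetype=splunk_resource_usage component=Hostwide host=<your_indexer>
| timechart span=1m max(data.mem_used) as mem_used_mb max(data.swap_used) as swap_used_mb max(data.normalized_load_avg_1min) as load_avg_1min

index=_introspection sourcetype=splunk_resource_usage component=PerProcess host=<your_indexer>
| timechart span=1m max(data.mem_used) as mem_used_mb by data.process_type
```

Run each search over a window of about an hour around the crash. The first should show when memory, swap and load start to climb on the host; the second should show which kind of Splunk process (for example search) is using the memory at that moment.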

Re: How to investigate crashing indexer

shivanandbm — Sat, 03 Aug 2019 05:00:12 GMT

Thanks a lot for replying. I am not getting any output from the crash search or from the restart search based on loader.
The Splunk system user is the top user during that time.
Also, memory and swap were completely exhausted, and then the Splunk process on the indexer stopped. I restarted it manually.

I just want to know why memory and swap are exhausted for a short period of time. I also see high load during that time.

I also saw that all the forwarders were disconnected during that time, as I received forwarder-missing alerts for many forwarders.

I am not seeing anything in the crash log.

Our Splunk process stopped, and I restarted it manually.

Regards, Shivanand