How to investigate crashing indexer

shivanandbm
Explorer

Our indexers are in a cluster. We have 4 indexers, and they crash about once a week. I do not know how to start investigating.
I have tried several approaches but could not identify a cause.

1) Whenever they crash, CPU load spikes and memory and swap are exhausted. This happens only around the time of the crash; the rest of the time the hosts are lightly utilized.

free -g
             total       used       free     shared    buffers     cached
Mem:            11         10          0          0          0          9
-/+ buffers/cache:          0         10
Swap:            3          0          3

2) Also, we see one accelerated search that completes around this time.
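The `free -g` output in item 1 is easier to interpret once the page cache is separated from application memory. Below is a minimal sketch that parses the older procps layout (the one with a `-/+ buffers/cache` row); the function name `parse_free` and the dictionary keys are made up for illustration, and the sample text simply repeats the numbers posted above.

```python
# Sample copied from the post; values are in GB because of `free -g`.
SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:            11         10          0          0          0          9
-/+ buffers/cache:          0         10
Swap:            3          0          3
"""


def parse_free(text):
    """Parse `free` output in the older layout with a '-/+ buffers/cache' row.

    Returns a dict of integers. Key names are illustrative, not a standard.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Mem:":
            # Columns: total used free shared buffers cached
            result["mem_total"] = int(parts[1])
            result["mem_used"] = int(parts[2])
            result["mem_free"] = int(parts[3])
            result["cached"] = int(parts[-1])
        elif parts[0] == "-/+":
            # parts looks like ['-/+', 'buffers/cache:', used, free]:
            # memory used by and free for applications once cache is excluded.
            result["app_used"] = int(parts[2])
            result["app_free"] = int(parts[3])
        elif parts[0] == "Swap:":
            result["swap_total"] = int(parts[1])
            result["swap_used"] = int(parts[2])
            result["swap_free"] = int(parts[3])
    return result


stats = parse_free(SAMPLE)
print("application memory used (GB):", stats["app_used"])
print("available to applications (GB):", stats["app_free"])
print("swap used (GB):", stats["swap_used"])
```

Read this way, 9 of the 10 GB reported as used in this snapshot are page cache, and swap is untouched, so it appears to have been captured outside a crash window. A capture taken while the problem is happening would be more informative.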

Let me know if there is any other information I should provide.

Please help us with this.

mayurr98
Super Champion

Here are some searches that help when investigating a crash:

1) When did Splunk last crash?

index=_internal sourcetype=splunkd_crash_log | stats count latest(_time) as last_crash by host | convert ctime(last_crash)

2) When did Splunk restart, based on loader messages?

index=_internal sourcetype=splunkd loader message=*xml

3) Which users run long searches?

index="_audit" action="search" (id=* OR search_id=*)
| eval user=if(user=="n/a",null(),user)
| stats max(total_run_time) as total_run_time first(user) as user by search_id
| stats count perc95(total_run_time) median(total_run_time) by user

You should also check Monitoring Console > Resource Usage for memory and disk space usage during the crash time period.

shivanandbm
Explorer

Thanks a lot for replying. Neither the crash search nor the loader-based restart search returns any results.
Processes owned by the Splunk system user are the top resource consumers during that time.
Memory and swap are also completely exhausted, after which the Splunk process on the indexer stopped. I restarted it manually.

I just want to know why memory and swap get exhausted for a short period. I also see high load during that time.
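Because the exhaustion is short-lived, it may help to record which processes hold the most resident memory at regular intervals, so that data exists for the minutes before the next incident. The following is a rough sketch, assuming a Linux host with `/proc` mounted; the helper names `rss_kb` and `top_processes` are hypothetical.

```python
import os


def rss_kb(status_text):
    """Return the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def top_processes(n=5, proc_root="/proc"):
    """Return up to n tuples (rss_kb, pid, name), largest resident set first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                text = fh.read()
        except OSError:
            continue  # process exited or is not readable
        # The first line of the status file looks like "Name:\tsplunkd".
        name = text.partition("\n")[0].partition(":")[2].strip()
        rows.append((rss_kb(text), int(entry), name))
    return sorted(rows, reverse=True)[:n]
```

Calling `top_processes()` from a scheduled job every minute and appending the result with a timestamp to a file would show whether splunkd itself or its search processes grow before the stop. The interval and output location are left open here.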

I also saw that the forwarders were disconnected during that time, as I received forwarder-missing alerts for many forwarders.

I am not seeing anything in the crash log.

Our Splunk process stopped and I restarted it manually.
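A process that stops without a crash log while memory and swap run out is consistent with the Linux kernel's OOM killer ending it, although that is only a hypothesis here. A small sketch for scanning kernel log lines for such an event follows; the exact message wording differs between kernel versions, so the pattern is an assumption, and `find_oom_kills` is a hypothetical helper name.

```python
import re

# Assumed message shapes (they vary by kernel version), for example
# "Out of memory: Kill process <pid> (<name>) ..." and
# "Killed process <pid> (<name>) ...".
OOM_PATTERN = re.compile(r"(?:Out of memory|Killed process)\D*(\d+) \(([^)]+)\)")


def find_oom_kills(lines, process_name="splunkd"):
    """Return (pid, line) tuples for OOM-killer messages naming process_name."""
    hits = []
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match and match.group(2) == process_name:
            hits.append((int(match.group(1)), line.rstrip()))
    return hits
```

The function takes any iterable of lines, for instance an open handle on the system's kernel log file or the captured output of `dmesg`. Which file holds kernel messages depends on the distribution, so no path is hard-coded. An empty result does not rule the OOM killer out if the log has rotated since the incident.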

Regards, Shivanand
