Splunk Search

How to count the number of times Splunk is restarted.

kamal_jagga
Contributor

Hi,

I want to create metrics counting the following:
1. Splunk restarts done from the UI.
2. Splunkd restarts.
3. Splunk forwarder restarts.

Kindly advise.


yannK
Splunk Employee

To see if Splunk is up, I look for these events:

index=_internal source=*splunkd.log*  "(build"

Sometimes, though, you can see two closely spaced events for a single restart. If you want the exact count, you can add a bucket per minute and dedup.
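
As a rough sketch of the bucket-and-dedup idea (not tested against any particular deployment), something like the following should work. The `restarts` field name and the grouping by `host` are additions for illustration only; `bin` is an alias for `bucket`.

```
index=_internal source=*splunkd.log* "(build"
| bin _time span=1m
| dedup host _time
| stats count as restarts by host
```

The `bin` step rounds each event's timestamp down to the minute, so two startup events from the same host within the same minute collapse into one after `dedup`.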

kamal_jagga
Contributor

Hey,

How can I add a bucket of a minute or two and dedup?

Would you be able to give the exact query string?


kamal_jagga
Contributor

Also,
When I used your query, I also found some extra events, so I modified it to the one below.

index=_internal source=splunkd.log "Splunkd starting (build 245427)."

Now, could you suggest how to count the number of events returned by this search?

Thanks

splunker12er
Motivator

index=_internal source=splunkd.log "(build" | timechart span=1m values(_raw) as Event

OR

index=_internal source=splunkd.log "(build" | dedup _raw | table _raw, _time


fdi01
Motivator

You can see many action=restart_splunkd messages in your "_audit" index.

Try something like:

index="_audit" host="host_you_want" | stats count(eval(action="restart_splunkd")) as "number of times Splunkd is restarted"

...

dsmc_adv
Path Finder

I downvoted this post because it doesn't return all the values.

dsmc_adv
Path Finder

This search does not return valid results for me; the one from yannK does.


kamal_jagga
Contributor

Thanks, it gave me the count.
But I found that there are multiple entries for the same restart, milliseconds apart.
Query: index="_audit" action="restart_splunkd"

Is it possible to add a filter or condition so that an event is counted only if there is a gap of, say, 2 minutes between it and the previous one?


fdi01
Motivator

If the answer works for you, you can vote it up or accept it.
To count an event only when there is a gap of, say, 2 minutes, see what jeffland does in the comment below.
You can also add by _time to the stats to see more detail.
Example: index="_audit" action="restart_splunkd" | bucket _time span=2m | dedup _time | stats count as "number of times Splunkd is restarted" by _time

Thanks, kamal_jagga.


jeffland
SplunkTrust

That's exactly what yannK suggested - bucketing time and deduping. Going from the above search, that would be

index="_audit" action="restart_splunkd" | bucket _time span=2m | dedup _time | stats count as "number of times Splunkd is restarted"

(Or you can leave action="restart_splunkd" inside the stats count, whichever you prefer, although filtering in the base search should be faster.)
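
One caveat with fixed 2-minute buckets: two events a few milliseconds apart can still land in different buckets if they straddle a bucket boundary, and would then be counted twice. A rough sketch of a gap-based alternative follows, assuming a single host and using `streamstats` to compare each event with the previous one. The field names `prev_time` and `gap` are made up for illustration, and 120 is the 2-minute gap in seconds.

```
index="_audit" action="restart_splunkd"
| sort 0 _time
| streamstats current=f window=1 last(_time) as prev_time
| eval gap=_time-prev_time
| where isnull(prev_time) OR gap>120
| stats count as "number of times Splunkd is restarted"
```

Here an event is counted only if it is the first one in the results or arrives more than 120 seconds after the previous event, so a burst of near-simultaneous audit entries counts as a single restart. With several hosts in the results, the comparison would need a `by host` clause on `streamstats`.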

MichaelPriest
Communicator

Have a look in the audit index, index="_audit", and look at the action field.

kamal_jagga
Contributor

Also, when I look at the results of the following query:

index=_internal splunkd.log "start"

I see two sourcetypes:
sourcetype=splunkd_remote_searches
and
sourcetype=splunkd, coming from the Splunk forwarder.

Is this the standard format? And which one is for splunkd?

I also want to know how to find the Splunk restarts done from the Splunk UI. (I just now restarted Splunk from the UI and don't see it in the splunkd restarts mentioned above.)

Kindly advise.


kamal_jagga
Contributor

Also, how can I find the metrics for search head restarts?

Any help is appreciated.
