Reporting

How to list ad-hoc/scheduled searches in order of CPU usage.

danielwan
Explorer

I saw a CPU usage spike on my all-in-one Splunk server (6.5.x) and would like to figure out which individual ad-hoc/scheduled search (i.e. which search name) caused it over the last 24 hours. How can I figure this out?

P.S. Some discussions mention the "perfmon" index, but my Splunk server does not have that index.

1 Solution

harsmarvania57
Ultra Champion

Hi,

Based on a Monitoring Console query, I have created the query below, which gives you the maximum CPU percentage used by each search. Note that it only reports a CPU percentage for searches whose runtime is greater than 10 seconds; for searches that take less than 10 seconds to run, Splunk does not ingest the pct_cpu field into the _introspection index.

`dmc_set_index_introspection` search_group=dmc_group_search_head search_group="*" sourcetype=splunk_resource_usage data.search_props.sid::* data.search_props.mode!=RT
| `dmc_rename_introspection_fields`
| eval search_label = if(isnotnull(label), label, sid)
| stats max(elapsed) as runtime max(mem_used) as mem_used max(data.pct_cpu) as pct_cpu earliest(_time) as _time by search_label, type, mode, app, role, user, host
| eval mem_used = round(mem_used, 2)
| eval runtime = `dmc_convert_runtime(runtime)`
| fields search_label, mem_used, pct_cpu, host, runtime, _time, type, mode, app, user, role
| eval _time=strftime(_time,"%+")
| rename search_label as Name, mem_used as "Memory Usage (MB)", pct_cpu as "CPU Percentage", host as Instance, runtime as Runtime, _time as Started, type as Type, mode as Mode, app as App, user as User, role as Role
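
To get the list in order of CPU usage, as asked in the original question, run the query above over the last 24 hours (or add earliest=-24h to its first line) and append a sort so the heaviest consumers come first; the top-20 cut-off below is arbitrary:

| sort - "CPU Percentage"
| head 20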

I hope this helps.

Thanks,
Harshil


koshyk
Super Champion

Have a try at finding the long-running searches; the search below is from the DMC (Monitoring Console).

   index=_audit host=* action=search sourcetype=audittrail search_id!="rsa_*" 
    | eval user = if(user="n/a", null(), user) 
    | eval search_type = case( match(search_id, "^SummaryDirector_"), "summarization", match(search_id, "^((rt_)?scheduler__|alertsmanager_)"), "scheduled", match(search_id, "\d{10}\.\d+(_[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12})?$"), "ad hoc", true(), "other") 
    | eval search=if(isnull(savedsearch_name) OR savedsearch_name=="", search, savedsearch_name)
    | stats min(_time) as _time, values(user) as user, max(total_run_time) as total_run_time, first(search) as search, first(search_type) as search_type,first(apiStartTime) as apiStartTime, first(apiEndTime) as apiEndTime by search_id
    | where isnotnull(search) AND true() | search user="*"
    | fields search, total_run_time, _time, apiStartTime, apiEndTime, search_type, user
    | sort - total_run_time
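
A cumulative view can also help, since a scheduled search that runs many times in a day may cost more CPU overall than one long ad-hoc search. Here is a rough sketch along the same lines, built from the same audit events (run it over the last 24 hours; the top-20 cut-off is arbitrary):

    index=_audit action=search sourcetype=audittrail info=completed
    | eval search = if(isnull(savedsearch_name) OR savedsearch_name=="", search, savedsearch_name)
    | stats count as runs, sum(total_run_time) as cumulative_run_time, max(total_run_time) as max_run_time by search, user
    | sort - cumulative_run_time
    | head 20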

Have a try and let us know


danielwan
Explorer

It seems to list all long-running searches under the "other" type.

My first question: is a long-running search necessarily a high-CPU-usage search? I am looking for the searches eating my CPU.

In addition, is total_run_time the cumulative execution time? I have some real-time and scheduled searches, and for those, cumulative execution time may not help much.

I also notice that the search type in the output is "other". Do these searches make up the "other" portion of the CPU usage (by process) pie chart in Settings > Monitoring Console > Overview? Are they the same as the searches marked N/A in the "Maximum Resource Usage of Searches" panel under Settings > Monitoring Console > Search > Search Activity: Instance?


koshyk
Super Champion

If your Splunk setup uses the default settings, max_searches_per_cpu=1, so the more time a search spends running, the more it ties up that CPU. Looking at long-running searches is a good way to diagnose high CPU.

Yes, total_run_time is the cumulative execution time. Real-time searches are a bit different: the time shown in the search above is from start to stop of the real-time search, so the value will be high, and I would suggest ignoring them in your analysis.
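
For context, the default concurrency limit for historical searches works out to roughly max_searches_per_cpu x number_of_CPUs + base_max_searches (1 x cores + 6 with default settings). A quick, hedged way to see what that comes to on your instance, assuming the numberOfVirtualCores field is returned by the server info endpoint:

    | rest splunk_server=local /services/server/info
    | eval default_historical_search_limit = numberOfVirtualCores * 1 + 6
    | table serverName, numberOfVirtualCores, default_historical_search_limit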

Please vote if the answer helped you. Cheers.


lloydknight
Builder

What servers are you monitoring? What do you mean by all-in-one Splunk? Is that a single-instance Splunk deployment? If you're using an add-on to monitor servers, index=os is for Unix systems and index=perfmon is for Windows systems.
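
For reference, if one of those add-ons were installed, host CPU would typically be queried along these lines (a rough sketch only; the sourcetype and field names such as counter, instance and Value depend on the add-on and its inputs):

    index=perfmon sourcetype="Perfmon:CPU" counter="% Processor Time" instance=_Total
    | timechart span=5m avg(Value) as avg_cpu_pct by host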


danielwan
Explorer

All-in-one means a single Splunk instance running as both search head and indexer.
I am not trying to monitor any remote host; instead, I want to find out which search running on my single-instance Splunk is eating the CPU of that instance.
My single-instance Splunk does not have an index called "os" or "perfmon".


lloydknight
Builder

You can view most of the details about your Splunk instance in Settings > Monitoring Console. Just drill down on CPU Usage and Memory Usage, and from there you can see the historical resource usage of your Splunk instance.


danielwan
Explorer

The CPU usage chart in Settings > Monitoring Console seems to be a snapshot of the current searches' resource usage.

When I click the search slice of the CPU usage pie chart, it directs me to the following search:

| rest splunk_server=local /services/server/status/resource-usage/splunk-processes
| eval sid = 'search_props.sid'
| eval process_class = case( process=="splunk-optimize","index service", process=="sh" OR process=="ksh" OR process=="bash" OR like(process,"python%") OR process=="powershell","scripted input", process=="mongod", "KVStore")
| eval process_class = case( process=="splunkd" AND (like(args,"-p %start%") OR like(args,"service")),"splunkd server", process=="splunkd" AND isnotnull(sid),"search", process=="splunkd" AND (like(args,"fsck%") OR like(args,"recover-metadata%") OR like(args,"cluster_thing")),"index service", process=="splunkd" AND args=="instrument-resource-usage", "scripted input", (like(process,"python%") AND like(args,"%/appserver/mrsparkle/root.py%")) OR like(process,"splunkweb"),"Splunk Web", isnotnull(process_class), process_class)
| eval process_class = if(isnull(process_class),"other",process_class)
| search process_class=search
| eval x="cpu_usage"

I would like to see the top searches consuming CPU over a past time range, e.g. over the last few days.

I have checked Settings > Monitoring Console > Search > Search Activity: Instance > Maximum Resource Usage of Searches. The search that consumes over half of the CPU cores is shown with the name N/A. What does N/A mean here, and how can I see the real search name?


lloydknight
Builder

Hello @danielwan

try this app instead
https://splunkbase.splunk.com/app/3760/

It will monitor your Splunk searches and schedules, as well as your Splunk infrastructure and knowledge objects.

Or you can install the Splunk Add-on for Unix and Linux to capture the processes running in the background.
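
Once the add-on is collecting data, here is a rough sketch of how the captured process data could be searched for CPU-heavy splunkd processes (field names such as pctCPU, USER and COMMAND in the top sourcetype are assumptions that may differ by add-on version):

    index=os sourcetype=top COMMAND=splunkd
    | stats max(pctCPU) as max_cpu_pct, avg(pctCPU) as avg_cpu_pct by host, USER, COMMAND
    | sort - max_cpu_pct

Note that this only shows process-level CPU; tying a spike back to an individual search name still needs the search-level data from _audit or _introspection discussed above.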

Hope it helps!

Thanks
