How can I find what source running in Splunk is causing the Linux server to spike in CPU utilization?

Log_wrangler · ‎08-16-2018

I received this warning:

Search peer ip-1-1-1-1.ec2.internal has the following message: skipped indexing of internal audit event will keep dropping events until indexer congestion is remedied. Check disk space and other issues that may cause indexer to block

Uptime shows very high CPU utilization on the server.

Is there a query against _internal I can use to see whether an app, alert, source, or scheduled task is causing this?

Thank you

martin_mueller · ‎11-04-2018

Obvious side-note: Do upgrade your Splunk. There have been tons of improvements made since 5.0, and 5.0 is out of support - end of life was reached a year ago.
The introspection endpoints used by the monitoring console - and the search that doesn't work for you - are just a tiny sliver of that pie.
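Once you are on a version that ships the _introspection index, per-process resource data can attribute CPU to individual searches. This is a rough sketch rather than a Monitoring Console search: it assumes the splunk_resource_usage sourcetype with the data.pct_cpu and data.search_props.* fields found in recent releases, and the "(ad hoc)" label is only a placeholder for searches that have no saved search name.

```spl
index=_introspection sourcetype=splunk_resource_usage component=PerProcess data.search_props.sid=*
| rename data.search_props.* AS *, data.pct_cpu AS pct_cpu
| fillnull value="(ad hoc)" label
| stats avg(pct_cpu) AS avg_cpu_pct, max(pct_cpu) AS max_cpu_pct BY app, user, label
| sort - avg_cpu_pct
```

Keep the time range short (for example, the last hour around a spike) so the search itself does not add much load.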

sudosplunk · ‎08-16-2018

If you've set up the Monitoring Console, you can use its Resource Usage tab, which shows resource usage across your deployment. It's definitely a good starting point.

Here is the search that Splunk uses to calculate Resource Usage: Deployment. See if this works for you.

| rest splunk_server_group=* /services/server/status/resource-usage/hostwide
| join type=outer splunk_server [
  | rest splunk_server_group=* /services/server/status/resource-usage/iostats
  | eval iops = round(reads_ps + writes_ps)
  | eval iops_mountpoint = iops." (".mount_point.")"
  | eval cpupct_mountpoint = cpu_pct."% (".mount_point.")"
  | stats values(iops_mountpoint) as iops_mountpoint, values(cpupct_mountpoint) as cpupct_mountpoint by splunk_server]
| eventstats min(eval(if(isnull(normalized_load_avg_1min), "0", "1"))) as _load_avg_full_availability
| eval normalized_load_avg_1min = if(isnull(normalized_load_avg_1min), "N/A", normalized_load_avg_1min)
| eval core_info = if(isnull(cpu_count), "N/A", cpu_count)." / ".if(isnull(virtual_cpu_count), "N/A", virtual_cpu_count)
| eval cpu_usage = cpu_system_pct + cpu_user_pct
| eval mem_used_pct = round(mem_used / mem * 100 , 2)
| eval mem_used = round(mem_used, 0)
| eval mem = round(mem, 0)
| fields splunk_server, normalized_load_avg_1min, core_info, cpu_usage, mem, mem_used, mem_used_pct, iops_mountpoint, cpupct_mountpoint
| sort - cpu_usage, -mem_used
| rename splunk_server AS Instance, normalized_load_avg_1min AS "Load Average", core_info AS "CPU Cores (Physical / Virtual)", cpu_usage AS "CPU Usage (%)", mem AS "Physical Memory Capacity (MB)", mem_used AS "Physical Memory Usage (MB)", mem_used_pct AS "Physical Memory Usage (%)", iops_mountpoint as "I/O Operations per second (Mount Point)", cpupct_mountpoint as "I/O Bandwidth Utilization (Mount Point)"
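If the REST endpoints are not available on your version, or you would rather stay in _internal as you asked, metrics.log is a cheaper place to start. The following is a rough sketch and is not taken from the Monitoring Console. It ranks sources by indexed volume over whatever time range you pick, assuming the usual per_source_thruput fields (series, kb, ev) are present. metrics.log only records the busiest series in each interval, so low-volume sources will not show up.

```spl
index=_internal source=*metrics.log* group=per_source_thruput
| stats sum(kb) AS total_kb, sum(ev) AS total_events BY series
| sort - total_kb
| head 20
```

For scheduled searches and alerts, a similar sketch against scheduler.log totals run time per saved search. It assumes the scheduler sourcetype and the app, savedsearch_name, and run_time fields exist on your version:

```spl
index=_internal sourcetype=scheduler status=success
| stats count AS runs, sum(run_time) AS total_run_time BY app, savedsearch_name
| sort - total_run_time
```

Neither search measures CPU directly. They only point at the heaviest inputs and scheduled searches, which you can then compare against the times of the CPU spikes.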

Log_wrangler · ‎08-16-2018

Thank you for the reply. Unfortunately, the indexer is on v5.x, and there is no Monitoring Console in that version.

I am afraid to run your query as it might overload the indexer.

But I will look at it... and give it a shot.

Log_wrangler · ‎08-16-2018

The query did not work.

sudosplunk · ‎08-16-2018

I will convert this to a comment so that others can help you in the meantime.

Log_wrangler · ‎11-02-2018

Your answer is correct... however, I just had an old version.

Please convert it to an answer and I will accept it.

