Splunk Enterprise

Issues when recreating Prometheus Metrics Graphs from Grafana in Splunk

tankelvi
New Member

Hi,

I tried to recreate Prometheus metrics graphs from Grafana in Splunk. However, for certain queries the values in Splunk are offset from the values in Grafana, as shown in the cases below:

Case 1:

  1. Queries that are using irate in PromQL
    Eg:
    1. PromQL Queries:
      (sum by(instance) (irate(node_cpu_seconds_total{instance="$node",job="$job", mode!="idle"}[$__rate_interval])) / on(instance) group_left sum by (instance)((irate(node_cpu_seconds_total{instance="$node",job="$job"}[$__rate_interval])))) * 100


      Result:
      tankelvi_0-1681443654322.png
    2. Splunk Queries:

      | mstats rate_sum(node_cpu_seconds_total) as seconds_total where index=<index_name> by job instance mode span=15s
      | sort - _time
      | dedup mode
      | stats sum(seconds_total) as seconds_total sum(eval(if(mode!="idle",seconds_total,0))) as cpu_busy
      | eval "CPU Busy" = round((cpu_busy / seconds_total) * 100,2)
      | fields "CPU Busy"

      Result:

tankelvi_1-1681443654325.png
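
For comparison, the aggregation that the PromQL expression in this case describes can be sketched in Python. This is only an illustration: the `rates` dictionary, the `host-a` label, and the rate values are made up, and the function name `cpu_busy_percent` is hypothetical. It assumes a per-second rate has already been computed for every (instance, cpu, mode) series.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical per-series rates: (instance, cpu, mode) -> CPU-seconds per second.
rates: Dict[Tuple[str, str, str], float] = {
    ("host-a", "0", "idle"): 0.5,
    ("host-a", "0", "user"): 0.25,
    ("host-a", "0", "system"): 0.25,
    ("host-a", "1", "idle"): 0.75,
    ("host-a", "1", "user"): 0.25,
}


def cpu_busy_percent(series_rates: Dict[Tuple[str, str, str], float]) -> Dict[str, float]:
    """Sum the non-idle rates and all rates per instance, then take the ratio."""
    total = defaultdict(float)
    busy = defaultdict(float)
    for (instance, _cpu, mode), rate in series_rates.items():
        total[instance] += rate
        if mode != "idle":
            busy[instance] += rate
    # Skip instances whose total rate is zero to avoid dividing by zero.
    return {
        instance: round(busy[instance] / total[instance] * 100, 2)
        for instance in total
        if total[instance] > 0
    }


print(cpu_busy_percent(rates))  # {'host-a': 37.5}
```

In this sketch both sums are grouped by instance before the division, which mirrors `sum by (instance)` on both sides of the PromQL expression.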

 

2. Queries that are not using irate in PromQL:

    1. PromQL Queries:
      avg(node_load5{instance="$node",job="$job"}) /  count(count(node_cpu_seconds_total{instance="$node",job="$job"}) by (cpu)) * 100


      Result:
      tankelvi_2-1681443654328.png

       

    2. Splunk Queries:

      | mstats avg("node_load5") prestats=true WHERE index=<index_name> span=15s
      | table _time psrsvd_sm_node_load5
      | sort - _time
      | stats first(psrsvd_sm_node_load5) as latest_psrsvd_sm_node_load5
      | join type=inner [| mstats count(node_cpu_seconds_total) prestats=true WHERE index=<index_name> by cpu span=15s
          | table cpu psrsvd_ct_node_cpu_seconds_total
          | dedup cpu
          | stats count(cpu) as cpu_count
          | table cpu_count]
      | eval sys_load=latest_psrsvd_sm_node_load5 / cpu_count * 100
      | sort - sys_load
      | table sys_load


      Result:

tankelvi_3-1681443654331.png
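
As a reference point for this case, the arithmetic in the PromQL expression can be sketched in Python. The function name `load_percent` and the sample inputs are hypothetical; the sketch only assumes that `count(count(...) by (cpu))` amounts to counting the distinct `cpu` label values.

```python
from typing import Iterable, List


def load_percent(load5_values: List[float], cpu_labels: Iterable[str]) -> float:
    """Average 5-minute load divided by the number of distinct CPUs, as a percentage."""
    cores = len(set(cpu_labels))  # several series (one per mode) share the same cpu label
    if not load5_values or cores == 0:
        raise ValueError("need at least one load sample and one cpu label")
    average_load = sum(load5_values) / len(load5_values)
    return average_load / cores * 100


# Hypothetical inputs: one load sample, four cores seen across several series.
print(load_percent([1.5], ["0", "1", "2", "3", "0", "1"]))  # 37.5
```

The numerator here is an average of the load samples, not a sum, and the denominator counts each core once regardless of how many series carry its label.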

 

 

Could anyone help with the following questions?

  1. Which aggregate function in Splunk is equivalent to irate in PromQL: rate, rate_sum, rate_avg, or another function that is not listed here?
  2. What might cause the offset between the values in Grafana and the values in Splunk?
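
To make question 1 concrete, here is a minimal Python sketch of the distinction involved, assuming counter samples given as (timestamp, value) pairs. PromQL's irate uses only the last two samples in the range, while a window-style rate spans the first and last sample. The helper names `irate` and `window_rate` and the sample data are hypothetical, and `window_rate` is simplified (no counter-reset handling or boundary extrapolation). Whether any of Splunk's rate functions matches either behaviour exactly is the open question.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def irate(samples: List[Sample]) -> float:
    """Per-second rate computed from the last two samples only."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be increasing")
    # A decrease is treated as a counter reset: the counter restarted from zero.
    delta = v1 - v0 if v1 >= v0 else v1
    return delta / (t1 - t0)


def window_rate(samples: List[Sample]) -> float:
    """Simplified average per-second rate between the first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be increasing")
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter scraped every 15 seconds, with a burst in the last interval.
samples = [(0.0, 100.0), (15.0, 103.0), (30.0, 106.0), (45.0, 118.0)]
print(irate(samples))        # 0.8  (12 over the last 15 s)
print(window_rate(samples))  # 0.4  (18 over the whole 45 s)
```

With bursty data the two values differ by a factor of two in this example, so a mismatch in which kind of rate is computed could by itself produce an offset between two graphs.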

Thank you very much.
