Splunk Enterprise

Issues when recreating Prometheus Metrics Graphs from Grafana in Splunk

tankelvi
New Member

Hi,

I tried to recreate Prometheus metrics graphs from Grafana in Splunk. However, for certain queries the values I get in Splunk are offset from the values shown in Grafana, as in the cases below:

Case 1:

  1. Queries that use irate in PromQL
    E.g.:
    1. PromQL query:
      (sum by(instance) (irate(node_cpu_seconds_total{instance="$node",job="$job", mode!="idle"}[$__rate_interval])) / on(instance) group_left sum by (instance)((irate(node_cpu_seconds_total{instance="$node",job="$job"}[$__rate_interval])))) * 100


      Result:
      tankelvi_0-1681443654322.png
    2. Splunk query:

      | mstats rate_sum(node_cpu_seconds_total) as seconds_total where index=<index_name> by job instance mode span=15s
      | sort - _time
      | dedup mode
      | stats sum(seconds_total) as seconds_total sum(eval(if(mode!="idle",seconds_total,0))) as cpu_busy
      | eval "CPU Busy" = round((cpu_busy / seconds_total) * 100,2)
      | fields "CPU Busy"

      Result:

tankelvi_1-1681443654325.png
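To make clear what I am trying to match: as I understand the Prometheus documentation, irate computes a per-second rate from only the last two samples in the range, whereas a windowed rate averages the increase over the whole range. Below is a minimal Python sketch of that difference. The sample values are made up for illustration, both helper functions are my own hypothetical names, and the sketch ignores counter resets and the extrapolation that Prometheus's rate applies. Whether any of Splunk's rate functions behaves like either of these is exactly what I am unsure about.

```python
from typing import List, Tuple

# (timestamp in seconds, counter value); illustrative data, not real measurements
Sample = Tuple[float, float]


def irate_like(samples: List[Sample]) -> float:
    """Per-second rate from the last two samples only (instant rate).

    Simplification: counter resets are not handled.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    return (v_last - v_prev) / (t_last - t_prev)


def window_rate_like(samples: List[Sample]) -> float:
    """Average per-second rate between the first and last sample in the window.

    Simplification: no extrapolation to the window edges, no reset handling.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


# A counter scraped every 15 s that speeds up in the final interval
samples = [(0.0, 100.0), (15.0, 101.5), (30.0, 103.0), (45.0, 109.0)]

instant = irate_like(samples)         # uses only (30, 103.0) and (45, 109.0)
windowed = window_rate_like(samples)  # uses (0, 100.0) and (45, 109.0)
print(f"irate-like: {instant:.3f}/s, window-rate-like: {windowed:.3f}/s")
```

With the same input, the two calculations give different numbers whenever the counter's growth is not constant across the window, which could be one source of the kind of offset I am seeing.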

 

2. Queries that do not use irate in PromQL:

    1. PromQL query:
      avg(node_load5{instance="$node",job="$job"}) /  count(count(node_cpu_seconds_total{instance="$node",job="$job"}) by (cpu)) * 100


      Result:
      tankelvi_2-1681443654328.png

       

    2. Splunk query:

      | mstats avg("node_load5") prestats=true WHERE index=<index_name> span=15s
      | table _time psrsvd_sm_node_load5
      | sort - _time
      | stats first(psrsvd_sm_node_load5) as latest_psrsvd_sm_node_load5
      | join type=inner [| mstats count(node_cpu_seconds_total) prestats=true WHERE index=<index_name> by cpu span=15s
      | table cpu psrsvd_ct_node_cpu_seconds_total
      | dedup cpu
      | stats count(cpu) as cpu_count
      | table cpu_count]
      | eval sys_load=latest_psrsvd_sm_node_load5 / cpu_count * 100
      | sort - sys_load
      | table sys_load


      Result:

tankelvi_3-1681443654331.png
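For reference, this is the arithmetic I believe the PromQL expression in case 2 performs: the average of node_load5 across the matching series, divided by the number of distinct cpu label values, times 100. The Python sketch below is only my reading of that expression. The function name and the input values are hypothetical, chosen for illustration.

```python
from statistics import mean
from typing import Iterable, List


def sys_load_percent(load5_values: List[float], cpu_labels: Iterable[str]) -> float:
    """Average 5-minute load as a percentage of the number of distinct CPUs.

    load5_values: node_load5 values of the matching series (usually one per instance)
    cpu_labels:   the cpu label of every node_cpu_seconds_total series; each CPU
                  appears once per mode, so duplicates are expected and collapsed
    """
    cpu_count = len(set(cpu_labels))  # mirrors count(count(...) by (cpu))
    if cpu_count == 0:
        raise ValueError("no CPU series found")
    return mean(load5_values) / cpu_count * 100


# Illustrative input: one instance with load 2.0 and 4 CPUs, 2 modes each
load5 = [2.0]
cpus = ["0", "0", "1", "1", "2", "2", "3", "3"]
print(sys_load_percent(load5, cpus))
```

The point I want to reproduce in Splunk is that the divisor is the count of distinct CPUs, not the count of series or data points.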

 

 

Could anyone help with the following questions?

  1. Which Splunk aggregate function is the equivalent of PromQL's irate? For example: rate, rate_sum, rate_avg, or another function that is not listed here?
  2. What might be causing the offset between the values in Grafana and the values in Splunk?

Thank you very much.
