Graphing counter delta values over multiple dimensions?

ruisantos
Path Finder

We have a script gathering DNS server statistics, which are monotonically increasing counters, mostly for requests served.

We have 3 dimensions to our data:

  • dns_host: the host where the statistics come from
  • bundle: a name for a collection of metrics
  • metric: a single metric (number of requests of type X, or events of type Y)

Our hosts are grouped in sets of four, so it makes sense to generate aggregate statistics such as "requests per second across all DNS hosts of the same type".

We've managed to graph the metric variations over time (requests per second) with a query like:

sourcetype="dns_stats" bundle="dns_queries_in" dns_host="dns04a"| sort _time | streamstats current=t global=f window=2 earliest(value) as curr latest(value) as next earliest(_time) as te latest(_time) as tf by metric | eval delta=(next-curr)/(tf-te) | timechart sum(delta) as delta by metric

However, this only graphs a single host's data. If we remove the dns_host= criterion, the query falls apart, since the delta can't relate events over 2 dimensions (metric AND dns_host).

Similarly, we haven't been able to graph cumulative requests per second per host.

How can we:

  • graph requests per second per metric over all dns_hosts of the same type (e.g. those matching regex dns_host="dns4.*")?
  • graph requests per second per dns_host?

We have full control over generation of data, so switching format is an option.

Our data look like this:

2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_out;metric=A;value=108723372
2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_out;metric=MX;value=1185025
2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_out;metric=AAAA;value=18344118
2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_out;metric=ANY;value=124916
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_out;metric=AAAA;value=108801938
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_out;metric=A;value=686732013
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_out;metric=ANY;value=1283341
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_out;metric=MX;value=4930715
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_out;metric=AAAA;value=109385996
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_out;metric=A;value=700378600
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_out;metric=ANY;value=971869
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_out;metric=MX;value=4495108
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_out;metric=AAAA;value=108383976
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_out;metric=A;value=711446253
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_out;metric=ANY;value=990522
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_out;metric=MX;value=4657965

2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_in;metric=AAAA;value=153916458
2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_in;metric=A;value=684622311
2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_in;metric=ANY;value=190745078
2015-12-22 11:00:46.225341;dns_host=dns04a;bundle=dns_queries_in;metric=MX;value=926441
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_in;metric=AAAA;value=1099794598
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_in;metric=A;value=3572304139
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_in;metric=ANY;value=561378563
2015-12-22 11:00:52.323281;dns_host=dns14a;bundle=dns_queries_in;metric=MX;value=4034320
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_in;metric=AAAA;value=1237246618
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_in;metric=ANY;value=417989063
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_in;metric=A;value=3888269733
2015-12-22 11:00:58.102450;dns_host=dns04b;bundle=dns_queries_in;metric=MX;value=4180641
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_in;metric=AAAA;value=1225784262
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_in;metric=ANY;value=420711347
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_in;metric=A;value=3831717564
2015-12-22 11:01:03.660463;dns_host=dns14b;bundle=dns_queries_in;metric=MX;value=4363842
1 Solution

ruisantos
Path Finder

We've managed to do it with a query like:

sourcetype="dns_stats" bundle="dns_queries_in" | regex dns_host="dns.*4" | eval dim=dns_host."-".metric | bin _time span=4m | sort _time | streamstats current=t global=f window=2 earliest(value) as prev_v latest(value) as curr_v earliest(_time) as prev_t latest(_time) as curr_t by dim | eval rps=(curr_v-prev_v)/(curr_t-prev_t) | timechart sum(rps) as rps by metric

Everything up to and including eval rps=... calculates the rate of change (value delta divided by time delta) for each host+metric pair, tracked through the combined dim field. The timechart then sums those rates per metric.

To get a per-host graph, instead of per-metric, we just change the ending to timechart sum(rps) as rps by dns_host.
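
Spelled out in full, the per-host variant might look like the sketch below. It is the same pipeline with only the final timechart changed. window=2 is an assumption carried over from the single-host query in the question, so that each delta spans two consecutive samples of the same host+metric pair.

```spl
sourcetype="dns_stats" bundle="dns_queries_in"
| regex dns_host="dns.*4"
| eval dim=dns_host."-".metric
| bin _time span=4m
| sort _time
| streamstats current=t global=f window=2 earliest(value) as prev_v latest(value) as curr_v earliest(_time) as prev_t latest(_time) as curr_t by dim
| eval rps=(curr_v-prev_v)/(curr_t-prev_t)
| timechart sum(rps) as rps by dns_host
```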

The bin _time span=4m is a hack which was introduced because stats for each host are gathered with a slight time difference (seconds), which is enough to create gaps in the graphs in some situations.
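
A possible simplification, untested here: streamstats accepts a list of fields in its by clause, so the combined dim field may not be needed. The sketch below groups by dns_host and metric directly and, with current=f window=1, reads the previous sample of each pair instead of the earliest/latest of a two-event window. The field names prev_v and prev_t are illustrative only. sort 0 is used because a plain sort returns at most 10000 results by default.

```spl
sourcetype="dns_stats" bundle="dns_queries_in"
| regex dns_host="dns.*4"
| bin _time span=4m
| sort 0 _time
| streamstats current=f global=f window=1 last(value) as prev_v last(_time) as prev_t by dns_host metric
| eval rps=(value-prev_v)/(_time-prev_t)
| timechart sum(rps) as rps by metric
```

The first sample of each host+metric pair has no predecessor, so its rps is null and does not contribute to the sum.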


jplumsdaine22
Influencer

The data sample you provided is just a single point in time, but I'm guessing the following searches might help:

Range of hosts:

sourcetype="dns_stats" dns_host=dns4* | eval data-{dns_host}-{metric}-{bundle}=value | timechart per_second(data-*) as data-*

All hosts:

sourcetype="dns_stats" | eval data-{dns_host}-{metric}-{bundle}=value | timechart per_second(data-*) as data-*

Have a look at these Splunk Answers and blog posts:
https://answers.splunk.com/answers/47037/delta-then-sum-then-graph-from-multiple-hosts.html
http://blogs.splunk.com/2014/05/28/search-commands-delta/
