How to create a dashboard on uptime reporting for management?

mmacdonald70
Explorer

I have been asked to come up with a dashboard for my management team. I am trying to build it from Nagios performance data. The data has an ICMP poll against every device on the network, every 5 minutes. The data looks like this:

June 4 00:00:00 host_name = switch1 loss=0%
June 4 00:00:00 host_name = switch2 loss=100%
June 4 00:05:00 host_name = switch1 loss=0%
June 4 00:05:00 host_name = switch2 loss=0%

I created the following search

| eval ping_up=if(loss!="100%", 100, 0)
| stats avg(ping_up) as uptime
| eval uptime=round(uptime, 2)
| eval uptime=uptime . "%"

First of all, this doesn't seem very efficient. Second, they are now asking for a monthly trend over the past 2 years as well as a real-time dashboard (i.e., current uptime is X%). I can't seem to find a way to do these without a huge hit to the system.

sundareshr
Legend

For your "real-time" dashboard, you can use the search below. Change earliest to your liking. Display the result as a single value and set the formatting there.

index=logs earliest=-1h@h | convert num(loss) as uptime | stats avg(uptime) as uptime

For the monthly trend:

index=logs earliest=-2y@y | convert num(loss) as uptime | timechart span=1mon avg(uptime) as uptime | eval uptime=tostring(uptime, "commas")."%"

To make this more efficient, you can save this as an accelerated report OR create a summary index and use that.
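
As a rough sketch of the summary-index route (not a tested configuration): a search scheduled once a day could roll the previous day's polls up into per-host counts, and the trend report would then read only those rollups. The index name uptime_summary and the field names polls_up and polls_total are hypothetical, the target index has to exist before collect can write to it, and the up/down test reuses the loss!="100%" check from the question.

```spl
index=logs earliest=-1d@d latest=@d
| eval ping_up=if(loss!="100%", 1, 0)
| bin _time span=1d
| stats sum(ping_up) as polls_up count as polls_total by _time host_name
| collect index=uptime_summary
```

The monthly report can then run against the much smaller summary index:

```spl
index=uptime_summary earliest=-2y@y
| timechart span=1mon sum(polls_up) as polls_up sum(polls_total) as polls_total
| eval uptime=round(100 * polls_up / polls_total, 2)
| fields _time uptime
```

Summing the counts and dividing once per month avoids averaging daily averages, which would give a day with missing polls the same weight as a complete one.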

mmacdonald70
Explorer

I tried changing it to:

index=logs earliest=-1h@h | convert num(loss) as uptime | eval uptime=(100-uptime) | stats avg(uptime) as uptime

That seems to work.
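
The same inversion carries over to the monthly trend. The sketch below is one way to write it, assuming loss is extracted as a string such as 0% or 100%, as in the sample data. It strips the percent sign with rtrim and tonumber instead of convert, and the field name loss_pct is made up for illustration.

```spl
index=logs earliest=-2y@y
| eval loss_pct=tonumber(rtrim(loss, "%"))
| eval uptime=100-loss_pct
| timechart span=1mon avg(uptime) as uptime
| eval uptime=round(uptime, 2)
```

This averages partial packet loss into the figure, so a poll with 40% loss counts as 60% up. The search in the original question instead treats any poll below 100% loss as fully up; which definition to report is worth agreeing on with management.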

sundareshr
Legend

Either that, or in your search rename the field to downtime 🙂

mmacdonald70
Explorer

Thanks. This looks much better, except that loss is actually packet loss (0% loss is good). This search gives me 0% uptime when I should be getting 100%. Any suggestions on reversing it?
