How to create a dashboard on uptime reporting for management?

mmacdonald70
Explorer

I have been asked to come up with a dashboard for my management team. I am trying to build it from some Nagios performance stats. The data comes from an ICMP poll of every device on the network, run every 5 minutes. The data looks like this:

June 4 00:00:00 host_name = switch1 loss=0%
June 4 00:00:00 host_name = switch2 loss=100%
June 4 00:05:00 host_name = switch1 loss=0%
June 4 00:05:00 host_name = switch2 loss=0%

I created the following search:

| eval ping_up=if(loss!="100%", 100, 0)
| stats avg(ping_up) as uptime
| eval uptime=round(uptime,2)
| eval uptime=uptime."%"

First of all, this doesn't seem very efficient. Second, they are now asking for a monthly trend over the past 2 years as well as a real-time dashboard (i.e., current uptime is X%). I can't find a way to do either without a huge hit to the system.

sundareshr
Legend

For your "real-time" dashboard, you can use this search. Change earliest to your liking. Display the result as a single value and set the formatting there.

index=logs earliest=-1h@h | convert num(loss) as uptime | stats avg(uptime) as uptime

For the monthly trend

index=logs earliest=-2y@y | convert num(loss) as uptime | timechart span=1mon avg(uptime) as uptime | eval uptime=tostring(uptime, "commas")."%"

To make this more efficient, you can save this as an accelerated report OR create a summary index and use that.
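
The idea behind a summary index can be illustrated outside Splunk. The following is a minimal Python sketch, not SPL and not how Splunk implements summaries: it rolls raw polls up to one small row per day, then builds the monthly figure from those rows instead of rescanning every poll. The function names, the year 2016, and the extra June 5 poll are hypothetical, added only for illustration. It stores a sum and a count per day rather than a daily average, so the monthly value stays weighted by the number of polls.

```python
from collections import defaultdict
from datetime import datetime


def daily_summary(polls):
    """Roll raw polls up to one [uptime_sum, poll_count] row per day.

    `polls` is an iterable of (timestamp, loss_percent) tuples.
    """
    summary = defaultdict(lambda: [0.0, 0])
    for ts, loss in polls:
        row = summary[ts.date()]
        row[0] += 100 - loss  # uptime of a single poll is 100 minus its loss
        row[1] += 1
    return summary


def monthly_uptime(summary):
    """Combine the daily rows into a poll-weighted uptime percentage per month."""
    months = defaultdict(lambda: [0.0, 0])
    for day, (uptime_sum, count) in summary.items():
        row = months[(day.year, day.month)]
        row[0] += uptime_sum
        row[1] += count
    return {month: round(s / c, 2) for month, (s, c) in months.items()}


# Hypothetical polls: the four sample events from the question (year assumed)
# plus one extra poll on the following day.
polls = [
    (datetime(2016, 6, 4, 0, 0), 0),
    (datetime(2016, 6, 4, 0, 0), 100),
    (datetime(2016, 6, 4, 0, 5), 0),
    (datetime(2016, 6, 4, 0, 5), 0),
    (datetime(2016, 6, 5, 0, 0), 0),
]

trend = monthly_uptime(daily_summary(polls))
print(trend)  # {(2016, 6): 80.0}
```

Averaging the two daily averages (75 and 100) would give 87.5 instead of 80, which is why the sketch keeps the sum and the count separately.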

mmacdonald70
Explorer

I tried changing it to:

index=logs earliest=-1h@h | convert num(loss) as uptime | eval uptime=(100-uptime) | stats avg(uptime) as uptime

That seems to work
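
As a quick sanity check of the inversion, here is a small Python sketch (not SPL) that applies the same arithmetic to the four sample events from the question. The helper names are hypothetical, and stripping the percent sign by hand only stands in for whatever numeric conversion the search performs.

```python
# The four sample events from the question, as (host_name, loss) pairs.
events = [
    ("switch1", "0%"),
    ("switch2", "100%"),
    ("switch1", "0%"),
    ("switch2", "0%"),
]


def loss_to_number(loss):
    """Strip the trailing percent sign and return the loss as a float."""
    return float(loss.rstrip("%"))


def average(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


losses = [loss_to_number(loss) for _, loss in events]

# Averaging the raw loss field reports packet loss, not uptime.
avg_loss = average(losses)

# Subtracting each loss from 100 before averaging gives uptime.
avg_uptime = average([100 - x for x in losses])

print(avg_loss)    # 25.0
print(avg_uptime)  # 75.0
```

One failed poll out of four gives 25% average loss and 75% uptime, so subtracting from 100 before averaging yields the number management is asking for.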

sundareshr
Legend

Either that, or rename the field to downtime in your search 🙂

mmacdonald70
Explorer

Thanks. This looks much better, except that loss is actually packet loss (0% loss is good). This search gives me 0% uptime when I should be getting 100%. Any suggestions on reversing it?
