How to create a dashboard on uptime reporting for management?

mmacdonald70
Explorer

I have been asked to come up with a dashboard for my management team. I am trying to build it from some Nagios performance stats. The data contains an ICMP poll of every device on the network, taken every 5 minutes. The data looks like this:

June 4 00:00:00 host_name = switch1 loss=0%
June 4 00:00:00 host_name = switch2 loss=100%
June 4 00:05:00 host_name = switch1 loss=0%
June 4 00:05:00 host_name = switch2 loss=0%

I created the following search:

| eval ping_up=if(loss!="100%", 100, 0)
| stats avg(ping_up) as uptime
| eval uptime=round(uptime, 2)
| eval uptime=uptime . "%"

First, this doesn't seem very efficient. Second, they are now asking for a monthly trend over the past 2 years as well as a real-time dashboard (i.e., current uptime is X%). I can't find a way to do either without a huge hit to the system.

1 Solution

sundareshr
Legend

For your "real-time" dashboard, you can use the search below. Change earliest to your liking. Display the result as a single value and set the formatting there.

index=logs earliest=-1h@h | convert num(loss) as uptime | stats avg(uptime) as uptime

For the monthly trend:

index=logs earliest=-2y@y | convert num(loss) as uptime | timechart span=1mon avg(uptime) as uptime | eval uptime=tostring(uptime, "commas")."%"

To make this more efficient, you can save this as an accelerated report OR create a summary index and use that.
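
As a rough sketch of the summary-index route, a scheduled search could roll each day's polls up into per-host counts, and the two-year trend could then read only those rollups. The index name uptime_summary and the fields up, polls_up and polls_total are hypothetical, the summary index is assumed to exist already, and loss and host_name are assumed to be extracted as fields. Note that this sketch counts a poll as down only at 100% loss, as in the original question, rather than averaging the loss values.

```spl
index=logs earliest=-1d@d latest=@d
| eval up=if(loss!="100%", 1, 0)
| bin _time span=1d
| stats sum(up) as polls_up count as polls_total by _time host_name
| collect index=uptime_summary
```

The monthly trend could then be built from the summary instead of the raw events:

```spl
index=uptime_summary earliest=-2y@y
| timechart span=1mon sum(polls_up) as polls_up sum(polls_total) as polls_total
| eval uptime=round(100 * polls_up / polls_total, 2)
| fields _time uptime
```

Storing counts rather than daily averages keeps the monthly figure weighted by the number of polls in each day.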

mmacdonald70
Explorer

I tried changing it to:

index=logs earliest=-1h@h | convert num(loss) as uptime | eval uptime=(100-uptime) | stats avg(uptime) as uptime

That seems to work
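
For reference, the same inversion can be carried into the monthly trend. The sketch below assumes loss is extracted as a string such as "0%" and strips the percent sign explicitly; loss_pct is a name introduced here for illustration. With the four sample events from the question (losses of 0, 100, 0 and 0), the average loss is 25, so the reported uptime would be 75%.

```spl
index=logs earliest=-2y@y
| eval loss_pct=tonumber(replace(loss, "%", ""))
| eval uptime=100-loss_pct
| timechart span=1mon avg(uptime) as uptime
| eval uptime=tostring(round(uptime, 2))."%"
```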

sundareshr
Legend

Either that, or rename the field to downtime 🙂

mmacdonald70
Explorer

Thanks. This looks much better, except that loss is actually packet loss (0% loss is good). This search gives me 0% uptime when I should be getting 100%. Any suggestions on reversing it?
