Dashboards & Visualizations

How to create a dashboard on uptime reporting for management?

mmacdonald70
Explorer

I have been asked to come up with a dashboard for my management team. I am trying to pull it from some Nagios performance stats. The data has an icmp poll against every network device on the network, every 5 minutes. The data looks like this:

June 4 00:00:00 host_name = switch1 loss=0%
June 4 00:00:00 host_name = switch2 loss=100%
June 4 00:05:00 host_name = switch1 loss=0%
June 4 00:05:00 host_name = switch2 loss=0%

I created the following search

| eval ping_up=if(loss!="100%", 100,0) 
| stats avg(ping_up_ as uptime 
| eval uptime=round(uptime,2) 
| eval uptime = uptime 
| "%"

First of all, this doesn't seem very efficient. Second of all, now they are asking for a monthly trend over the past 2 years as well as a real-time dashboard (ie current uptime is X%). I can't seem to find a way to do these without a huge hit to the system.

0 Karma
1 Solution

sundareshr
Legend

For your "real-time" dashboard, you can use this . Change earliest to your liking. Display this as a single value and set the formatting there.

index=logs earliest=1h@h | convert num(loss) as uptime | stats avg(uptime) as uptime 

For the monthly trend

index=logs earliest=2y@y | convert num(loss) as uptime | timechart span=1mon avg(uptime) as uptime | eval uptime=tostring(uptime, "commas")."%"

To make this more efficient, you can save this as an accelerated report OR create a summary index and use that.

View solution in original post

0 Karma

sundareshr
Legend

For your "real-time" dashboard, you can use this . Change earliest to your liking. Display this as a single value and set the formatting there.

index=logs earliest=1h@h | convert num(loss) as uptime | stats avg(uptime) as uptime 

For the monthly trend

index=logs earliest=2y@y | convert num(loss) as uptime | timechart span=1mon avg(uptime) as uptime | eval uptime=tostring(uptime, "commas")."%"

To make this more efficient, you can save this as an accelerated report OR create a summary index and use that.

0 Karma

mmacdonald70
Explorer

I tried changing it to:

 index=logs earliest=1h@h | convert num(loss) as uptime| eval uptime=(100-uptime) | stats avg(uptime) as uptime 

That seems to work

0 Karma

sundareshr
Legend

Either that, or in your rename the fields to downtime 🙂

0 Karma

mmacdonald70
Explorer

Thanks. This looks much better, except that loss is actually packet loss (0% loss is good). This search gives me 0% uptime when I should be getting 100%. Any suggestions on reversing it?

0 Karma
Get Updates on the Splunk Community!

Monitoring MariaDB and MySQL

In a previous post, we explored monitoring PostgreSQL and general best practices around which metrics to ...

Financial Services Industry Use Cases, ITSI Best Practices, and More New Articles ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Splunk Federated Analytics for Amazon Security Lake

Thursday, November 21, 2024  |  11AM PT / 2PM ET Register Now Join our session to see the technical ...