Trouble with creating a Splunk service availability dashboard

Mohsin123
Path Finder

Hi ,

We are building a service availability dashboard based on the formula below. Could you please help me implement it in SPL?

Availability of a service is calculated as follows:

Availability = (Total Availability Hours - [(End time of first P1 - Start time of first P1) + (End time of second P1 - Start time of second P1) + ...]) * 100 / Total Availability Hours

gcusello
SplunkTrust

Hi @Mohsin123,

to help you, we need more information about the data:

What are P1, P2, ...? Are they systems or something else?

How can the start and end events for each Px be recognized? Is there one string for the start event and another for the end event, or does a single event contain both start and end?

Could you share some sample events?

Ciao.

Giuseppe

ITWhisperer
SplunkTrust

Are you concerned about overlapping P1s? For example, if the second P1 starts before the first one ends, or if the second P1 is completely within the time period of the first?

What level of granularity are you looking for? For example, sub-second, second, minute?

Can you share some of your raw events?

Mohsin123
Path Finder

#1.1 Service Continuity

Description:

This is a child-view of the Business Parent View.

This displays the Availability of applications based on the duration of all P1 incidents for the application in the month.

 

Data Flow:

Example: We have 2 P1 incidents for AppName: ToolsGra1 in October

P1 : 10/23/2021 10:05 am to 10/23/2021 11:05 am

P2 : 10/25/2021 7:15 am to 10/25/2021 09:15 am

For October, we have 24 hours * 31 days = 744 hours in the month

Total duration of P1s for ToolsGra1 for the month = 3 hours

Availability of ToolsGra1 = ((744 - 3) / 744) * 100 = 99.60
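
For anyone who wants to check this arithmetic in Splunk before wiring in real data, here is a minimal run-anywhere sketch. It builds the two sample incidents with makeresults instead of reading an index, so the field names (start_time, end_time, outage_hours, total_hours) are only illustrative. It uses 24 * 31 = 744 hours for October and assumes the second incident ends on 10/25/2021 09:15 am, which gives the 3 hours of total downtime stated above.

```spl
| makeresults count=2
| streamstats count AS n
| eval AppName="ToolsGra1"
| eval start_time=if(n=1, "10/23/2021 10:05 am", "10/25/2021 7:15 am")
| eval end_time=if(n=1, "10/23/2021 11:05 am", "10/25/2021 09:15 am")
| eval start_epoch=strptime(start_time, "%m/%d/%Y %I:%M %p")
| eval end_epoch=strptime(end_time, "%m/%d/%Y %I:%M %p")
| eval outage_hours=(end_epoch - start_epoch) / 3600
| stats sum(outage_hours) AS outage_hours BY AppName
| eval total_hours=24 * 31
| eval availability=round((total_hours - outage_hours) / total_hours * 100, 2)
| table AppName outage_hours total_hours availability
```

Overlapping incidents are not handled here: if two P1s overlap, the shared time would be counted twice.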

ITWhisperer
SplunkTrust

This doesn't answer the questions about overlapping events or about granularity. Please can you clarify your requirements?

Mohsin123
Path Finder

@gcusello @ITWhisperer 

I'm using the code below.

My intention is to:

1. Calculate the duration (passed from the earliest/latest tokens in my dashboard: earliest="$time.earliest$" latest="$time.latest$")

2. Calculate the days when the service was unavailable

3. Calculate service availability

Problem statement: duration is not working.

Could you please help me? And thanks a lot @gcusello for your code 🙂


index=servicenow earliest="01/07/2021:00:00:00" latest=now()
| addinfo
| eval duration=round((info_max_time - info_min_time)/3600/24 ,0)
| dedup duration
| table duration


|appendcols [ search index="generic_servicenow" "xxxx" dv_priority="1 - Critical"
| dedup dv_closed_at
| dedup dv_sys_created_on
| timechart partial=f values(dv_closed_at) as endT values(dv_sys_created_on) as startT
| fields - _time
| sort startT
| eval startTime=strptime(startT,"%Y-%m-%d %T.%3Q")
| eval endTime=strptime(endT,"%Y-%m-%d %T.%3Q")
| eval diff=abs(startTime-endTime)/3600/24
| table endT startT endTime startTime diff
| fields diff
| stats sum(diff) as unavailability ]

| eval availability=round(((duration-unavailability)/duration)*100,2)

gcusello
SplunkTrust

Hi @Mohsin123,

Sorry, but I don't understand whether you solved your problem or not.

You said that duration doesn't work, but it seems a simple operation.

If it doesn't work, debug it separately: what's the result of this search?

index=servicenow earliest="01/07/2021:00:00:00" latest=now()
| addinfo
| eval duration=round((info_max_time - info_min_time)/3600/24 ,0)
| dedup duration
| table duration info_max_time info_min_time

Maybe it's a format problem.

Ciao.

Giuseppe

P.S.: Karma Points are appreciated by all the Contributors 😉
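
One possible cause, stated here only as an assumption, is that the base search returns no events for the selected window, in which case addinfo has no rows to annotate and duration never appears. A minimal sketch that avoids depending on indexed events is to generate a single row with makeresults and let addinfo attach the search time boundaries to it. The time range is expected to come from the panel's time picker rather than from inline earliest/latest values.

```spl
| makeresults
| addinfo
| eval duration=round((info_max_time - info_min_time) / 86400, 0)
| table duration info_min_time info_max_time
```

If this returns a sensible number of days, the calculation itself is fine and the problem is more likely in the base search or in how the two parts are combined.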

Mohsin123
Path Finder

@gcusello one more question.

If the last field has a value, for example

... | table availability

returns 92.9, it shows in the dashboard, but if it's blank then it shows "no results".

Can I show it as 100? That would ideally mean there's no P1 and my service is 100% available.

I tried with the if function, but it's not working:

...|eval availability=if (availability=0,"100",availability)

It's because 0 events are returned, which ideally means there is no P1 in ServiceNow.

gcusello
SplunkTrust

Hi @Mohsin123 ,

You can find many answers about this issue.

You have to add something that doesn't modify the result if there is one and gives you a result when there isn't, something like this, at the end of your search:

all your search
| append [ | makeresults | eval availability=0 | fields availability ]
| stats sum(availability) AS availability
| eval availability=if(availability=0,"100",availability)

Anyway, it's better to put a new question in a separate post, so you can get more and quicker answers.

Ciao.

Giuseppe

Mohsin123
Path Finder

Okay, I did it!

| appendpipe
    [ stats count
    | where count=0
    | eval count=100 ]

🙂
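
A small variation on this, sketched under the assumption that the final field is called availability as in the earlier posts: the appendpipe above writes 100 into a field named count, so a table panel would show a count column instead. Writing the fallback into availability keeps the column name the same in both cases. The first three lines below only simulate a search that returns no rows, so the sketch can be run anywhere.

```spl
| makeresults
| eval availability=92.9
| where availability<0
| appendpipe
    [ stats count
    | where count=0
    | eval availability=100
    | fields availability ]
| table availability
```

When the main search does return rows, stats count inside the subpipeline is greater than 0, the where clause drops that row, and the original results pass through unchanged.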

Mohsin123
Path Finder

@gcusello @ITWhisperer   

Example: We have 2 P1 incidents for AppName: ToolsGra1 in October

P1 : 10/23/2021 10:05 am to 10/23/2021 11:05 am

P1 : 10/25/2021 7:15 am to 10/25/2021 09:15 am

For October, we have 24 hours * 31 days = 744 hours in the month

Total duration of P1s for ToolsGra1 for the month = 3 hours

Availability of ToolsGra1 = ((744 - 3) / 744) * 100 = 99.60

gcusello
SplunkTrust

Hi @Mohsin123,

Is the information "P1 : 10/23/2021 10:05 am  to 10/23/2021 11:05 am" in one event/record or spread over several events?

If it's in one record and the event is just

"P1 : 10/23/2021 10:05 am  to 10/23/2021 11:05 am"

it's easy to calculate the duration of each P1 and the availability percentage:

index=your_index
| rex field=ppp "^P1\s+:\s+(?<start_time>\d+\/\d+\/\d+\s+\d+:\d+\s+\w+)\s+to\s+(?<end_time>\d+\/\d+\/\d+\s+\d+:\d+\s+\w+)"
| eval start_time_epoch=strptime(start_time,"%m/%d/%Y %I:%M %p"), end_time_epoch=strptime(end_time,"%m/%d/%Y %I:%M %p")
| rex field=start_time_epoch "^(?<start_time_2>[^\.]+)"
| rex field=end_time_epoch "^(?<end_time_2>[^\.]+)"
| eval unavailability=end_time_2-start_time_2
| stats sum(unavailability) AS unavailability BY AppName
| eval month_seconds=31*24*3600
| eval availability=((month_seconds-unavailability)/month_seconds)*100
| table AppName availability

This search shows the approach for calculating the availability percentage, but it must be adapted to your log format.

Ciao.

Giuseppe
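
The search above hardcodes month_seconds as 31*24*3600, which only fits 31-day months. As a hedged alternative, the length of the previous calendar month can be derived with relative_time, so the same search works for 28, 30 and 31 day months. The field names month_start, month_end and month_hours are illustrative, and the sketch assumes the report covers the previous full month.

```spl
| makeresults
| eval month_start=relative_time(now(), "-1mon@mon")
| eval month_end=relative_time(now(), "@mon")
| eval month_seconds=month_end - month_start
| eval month_hours=round(month_seconds / 3600, 0)
| table month_start month_end month_seconds month_hours
```

In time zones with daylight saving changes, a month that contains a clock change will come out one hour shorter or longer than days * 24.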
