Splunk Search

Calculation of availability

wcastillocruz
Path Finder
Hello dear community.
I'm a beginner with Splunk and would like your help with a project. I have to calculate the availability of application services. My input comes from a database through Splunk DB Connect; it contains all the events recorded in the database of a monitoring tool. Using the timestamp, I would like to calculate the duration of an incident, from the moment the status is failed until the return to normal. This is difficult because an event can occur several times a day, so I need some kind of foreach that reads a row with severity 2 ("critical") and then finds the return to normal for that row, with severity 0 ("OK"), using the timestamp, since the return to normal always occurs later. I don't know if I managed to explain the problem. Thank you for your precious help.


abhijeet01
Path Finder

Hi wcastillocruz,

I think the field called node_ref would help you find the duration between when the incident was triggered and when it was closed. This is possible with the transaction command.

You need to merge the events that share the same "node_ref" value, that is, all the events belonging to one incident. Then you will get the exact duration.

  index=xyz node_ref=*
  | transaction node_ref
  | stats values(field_1) AS "field_1" values(duration) AS "Duration" by node_ref eventcount
  | fields - eventcount

The Splunk documentation for the transaction command may help you:

https://docs.splunk.com/Documentation/Splunk/8.1.0/SearchReference/Transaction

gcusello
SplunkTrust

Hi @wcastillocruz,

Is there an ID in your data that identifies each transaction?

If so, you can run something like this:

| stats earliest(_time) AS earliest latest(_time) AS latest BY ID

This way you have the start and the end of each transaction, and you can derive the uptime from them.
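A minimal sketch of how the duration could be derived from that stats output, assuming the field really is called ID and that each ID covers exactly one incident. The names duration_sec and duration_readable are made up for illustration:

```spl
| stats earliest(_time) AS earliest latest(_time) AS latest BY ID
| eval duration_sec = latest - earliest
| eval duration_readable = tostring(duration_sec, "duration")
```

Here duration_sec is the gap in seconds between the first and last event of each ID, and tostring with the "duration" format renders it as HH:MM:SS.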

If instead the only way to define a transaction is a starting and an ending string, you can use the transaction command:

| transaction startswith="start_string" endswith="end_string"

This way you get the duration of each transaction.

Ciao.

Giuseppe

wcastillocruz
Path Finder
Hello @gcusello,
thank you for responding quickly.
That is another difficulty: I do not have a unique identifier that links the "Critical" alert to the "OK" alert. I only have a unique reference for each row.
I think I should use the timestamp together with several columns as a composite key: after a "critical" alert, take the next event in time that matches Env + App + varname with severity 0, and calculate the duration of the incident from there.
Am I clear?

gcusello
SplunkTrust

Hi @wcastillocruz,

Yes, you can correlate events using both the common fields (Env + App + varname) and the start and end conditions of the transaction, something like this:

| transaction Env App varname startswith="start_string" endswith="end_string"

This way you get the duration of each transaction.
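As a sketch of what the start and end conditions might look like for the data described at the top of the thread, assuming the severity is available in a field named severity (a guessed name), with 2 meaning "critical" and 0 meaning "OK":

```spl
| transaction Env App varname startswith=eval(severity==2) endswith=eval(severity==0)
| table _time Env App varname duration eventcount
```

The eval(...) form tests a field value instead of searching for a string in the event text. The table command comes after transaction, so the fields transaction needs are still present when it runs.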

Ciao.

Giuseppe

wcastillocruz
Path Finder

Hello @gcusello, sorry for the late reply. I tried your suggestion. It seems to be correct, but I cannot work out what values to use for "startswith" and "endswith". Are they values from the Env, App, and varname columns? I added another column, description, and used the value "FAILED" for "startswith" and "OK" for "endswith", but it doesn't return anything.

[Screenshot attachment: Capture.PNG]

gcusello
SplunkTrust

Hi @wcastillocruz,

First, do not use the table command before the transaction command; use it after transaction.

Then try to use quotes in startswith and endswith.

Finally, to debug the transaction, try it without startswith and endswith and/or without one of the other transaction keys. You may have too many conditions, so that no events match all of them.
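One possible way to follow that debugging advice, sketched with the field names used in this thread. The idea is to start with only the grouping keys and check that events are grouped at all:

```spl
| transaction Env App varname
| table _time Env App varname eventcount duration
```

If that returns results, the conditions can be added back one at a time. Adding keepevicted=true also keeps transactions that never met the end condition; they can be told apart by the closed_txn field (0 for incomplete, 1 for complete). The quoted values "FAILED" and "OK" are the ones mentioned in the question and may need adjusting to the real data:

```spl
| transaction Env App varname startswith="FAILED" endswith="OK" keepevicted=true
| table _time Env App varname eventcount duration closed_txn
```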

Ciao.

Giuseppe


wcastillocruz
Path Finder

Hi @gcusello, thanks for your help. I managed to display the start of an event and its return to normal on the same line. To close my question, I need one last boost: is it possible to subtract the two timestamps contained in the same field? The goal is to know the exact duration of the incident.


gcusello
SplunkTrust

Hi @wcastillocruz,

the transaction command already generates a field called "duration" that's ready for you.
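The duration field is the number of seconds between the first and the last event of each transaction. A rough sketch of how it could feed an availability figure follows. It assumes a severity field as described in the question, a search with a bounded time range (so that addinfo returns finite info_min_time and info_max_time values), and the made-up output names downtime_sec and availability_pct:

```spl
| transaction Env App varname startswith=eval(severity==2) endswith=eval(severity==0)
| stats sum(duration) AS downtime_sec BY Env App varname
| addinfo
| eval availability_pct = round(100 * (1 - downtime_sec / (info_max_time - info_min_time)), 2)
| fields Env App varname downtime_sec availability_pct
```

Services with no incident in the time range produce no transaction, so they would not appear in this result and would have to be treated as 100% available separately.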

Ciao.

Giuseppe
