Splunk Search

How to calculate a duration percentage in a transaction search for non-existent events?

New Member

I'm trying to calculate the percentage of online time for a set of devices.

I created the following search:

...| transaction startswith="offline state start" endswith="offline state end" | stats sum(duration) as total_offline | eval online=100*(86400-total_offline)/86400

This works fine when "offline state start" and "offline state end" exist in the logs, because I can calculate the online percentage from the offline duration. But if those strings do not appear in the logs, how can I calculate the online percentage? Ideally the result should be 100% online when no offline events are found. How can I make eval return 100% when the logs contain neither "offline state start" nor "offline state end"?
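The arithmetic behind the question can be sketched outside Splunk. The following is a minimal Python illustration, not SPL, assuming that a missing offline total is treated as zero seconds; the function name online_percent and its parameters are hypothetical and do not come from the thread.

```python
SECONDS_PER_DAY = 86400  # the 24-hour window used in the original search


def online_percent(total_offline_seconds=None, window_seconds=SECONDS_PER_DAY):
    """Return the online percentage for one window.

    A missing total (None) is treated as zero offline seconds, which
    is the assumption that yields 100% when no offline events exist.
    """
    offline = total_offline_seconds or 0
    return 100 * (window_seconds - offline) / window_seconds


print(online_percent(None))   # no offline events found
print(online_percent(21600))  # six hours offline in total
```

The key step is substituting zero for the missing value before the percentage is computed, so the formula itself stays unchanged.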


SplunkTrust

Do your events have a unique identifier that ties each "offline state start" to its "offline state end", marking them as a pair? If not, I suggest combining those logs into one event at index time rather than at search time.

To do this, go to $SPLUNK_HOME/etc/system/local on the indexer and edit your props.conf file so that those independent events are indexed as one event. Share some sample data and I'll help you out.


SplunkTrust

So you have from=offline state start and to=offline state end in the same event. If you're going to group them with a transaction command, they need to be separate events. Below I pasted your example, but removed "offline state end" from the first event and "offline state start" from the second event. In this case you can use a transaction command to group those independent events into one event.

[2015-08-11 00:38:53,747] INFO tracking.WidgetStateMachine [tid-169]: cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, action=timeout

[2015-08-10 23:33:47,244] INFO tracking.WidgetStateMachine [tid-339] [request_id=1122334455666:6:0] : cmd=TRANSIT_STATE, device_id=112233445566, to=offline state end, action=connect

index=whatever | transaction startswith="offline state start" endswith="offline state end" | stats sum(duration) as total_offline | eval online=100*(86400-total_offline)/86400

Since you already have both "offline state start" and "offline state end" in the same event, you do not need to group them together. Is the device_id unique to each start/end pair? You need a unique field to tie the first event to the last event.


New Member

Sorry, I'm new to Splunk, so excuse my ignorance.

Actually, the logs I pasted before were taken after running the transaction command. I think that is why both values (from=offline state start, to=offline state end) appeared in the same event.

To answer your question about the uniqueness of device_id: device_id is unique. The situation is as follows:

  • Multiple devices exist in the test.
  • Each device could go offline multiple times during the test period (say, 24 hours). For example, Device-1 could go offline for the first 4 hours, come back online for an hour, and then go offline for another 2 hours. So I need to calculate a total offline period (duration) of 6 hours for Device-1. Similarly, I need to calculate the offline duration for all devices in the test and group the results by device_id.
  • Based on the offline duration, I need to calculate the online duration (and percentage).

The problem is to build a query that handles both of the following cases:

  • Multiple devices are offline occasionally; calculate the online percentage over the 24-hour period.
  • None of the devices are offline during the 24-hour period; calculate the online percentage (which will obviously be 100%, since there are no offline devices).
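Both cases reduce to the same per-device calculation once devices without offline periods are given a total of zero. The sketch below shows that logic in Python rather than SPL, using the Device-1 example (4 hours plus 2 hours offline). The function name online_percent_by_device, the device names, and the assumption that the full list of device IDs is known in advance are all hypothetical.

```python
from collections import defaultdict


def online_percent_by_device(device_ids, offline_periods, window_seconds=86400):
    """Sum offline durations per device and convert to online percent.

    device_ids: every device in the test (assumed known up front).
    offline_periods: iterable of (device_id, duration_in_seconds).
    Devices with no offline periods default to 0 seconds offline.
    """
    totals = defaultdict(float)
    for device_id, duration in offline_periods:
        totals[device_id] += duration
    return {
        device_id: 100 * (window_seconds - totals.get(device_id, 0.0)) / window_seconds
        for device_id in device_ids
    }


periods = [("device-1", 4 * 3600), ("device-1", 2 * 3600)]
print(online_percent_by_device(["device-1", "device-2"], periods))
# {'device-1': 75.0, 'device-2': 100.0}
```

Note that a device only gets its 100% if it appears in the device list; a result set built solely from offline events has no row for a device that never went offline.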


New Member

Hi skoelpin,

I forgot to mention the field in the transaction command. It's device_id, as shown below.

...| transaction device_id startswith="offline state start" endswith="offline state end" | ...

If the strings given in startswith and endswith are present in the logs, I see events like the ones in this sample log:

[2015-08-11 00:38:53,747] INFO tracking.WidgetStateMachine [tid-169]: cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=timeout
[2015-08-11 14:16:25,001] INFO tracking.WidgetStateMachine [tid-390] [request_id=1122334455666:0:FB5511D6] : cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=connect

[2015-08-10 16:21:16,071] INFO tracking.WidgetStateMachine [tid-169]: cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=timeout 
[2015-08-10 23:33:47,244] INFO tracking.WidgetStateMachine [tid-339] [request_id=1122334455666:6:0] : cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=connect