Splunk Search

Compute row differences

guilhem
Contributor

Hello everyone!

I have a very simple result table that looks like this:

_time s duration
2/10/13 12:20:22.000 PM 21279054471791556300 0
2/10/13 12:20:43.000 PM 21279054471791556300 21
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:52.000 PM 21279054471791556300 90
2/10/13 12:22:26.000 PM 21279054471791556300 124
2/10/13 12:22:59.000 PM 21279054471791556300 157
2/10/13 12:23:07.000 PM 21279054471791556300 165
2/11/13 5:09:16.000 AM 21242230731515268458 131
2/11/13 5:09:38.000 AM 21242230731515268458 153
2/11/13 5:09:46.000 AM 21242230731515268458 161
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:10:19.000 AM 21242230731515268458 194
2/11/13 5:10:24.000 AM 21242230731515268458 199
2/11/13 5:10:37.000 AM 21242230731515268458 212
2/11/13 5:10:45.000 AM 21242230731515268458 220
2/11/13 5:10:50.000 AM 21242230731515268458 225

And I would like to compute a new field, let say diff, that will contains the difference between each duration, row by row: like this:

_time s duration diff
2/10/13 12:20:22.000 PM 21279054471791556300 0 / Nothing
2/10/13 12:20:43.000 PM 21279054471791556300 21 21
2/10/13 12:21:07.000 PM 21279054471791556300 45 23
2/10/13 12:21:07.000 PM 21279054471791556300 45 0
2/10/13 12:21:52.000 PM 21279054471791556300 90 45
2/10/13 12:22:26.000 PM 21279054471791556300 124 69
2/10/13 12:22:59.000 PM 21279054471791556300 157 33
2/10/13 12:23:07.000 PM 21279054471791556300 165 8
2/11/13 5:09:16.000 AM 21242230731515268458 131 / Nothing
2/11/13 5:09:38.000 AM 21242230731515268458 153 22
2/11/13 5:09:46.000 AM 21242230731515268458 161 18
2/11/13 5:09:53.000 AM 21242230731515268458 168 7
2/11/13 5:09:53.000 AM 21242230731515268458 168 0
2/11/13 5:10:19.000 AM 21242230731515268458 194 26
2/11/13 5:10:24.000 AM 21242230731515268458 199 5
2/11/13 5:10:37.000 AM 21242230731515268458 212 13
2/11/13 5:10:45.000 AM 21242230731515268458 220 8
2/11/13 5:10:50.000 AM 21242230731515268458 225 5

I have tried to use the delta command, but it doesn't work because events from several s can occur at the same time.

I want to avoid using double makemv / mvexpand and compute the difference between all and filter after, as it will cost a lot of time (n square) and I really only need the duration(n) - duration(n-1) value

Thanks a lot!

Guilhem

1 Solution

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

View solution in original post

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

martin_mueller
SplunkTrust
SplunkTrust

list(duration) sounds quite cumbersome when you're basically just looking for last(duration)...

guilhem
Contributor

Be carefull thow, as I have just discovered, the lis() function only returns the first 100 results, so it may be convenient to use something else (penultimate duration value, instead of list).

I will update the answer when I found out how to find the penultimate value of a field using stats.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

guilhem
Contributor

Perfect, using the current=f to get the last-but-one value is clever!

I update the answer.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Consider this to get around the list/mvindex thingy:

... | streamstats current=f last(duration) as last_duration by s | eval diff = duration - last_duration

AshimaE
Explorer

Could you explain what current=f is used for here. Also last(x) takes us to the oldest entry for that s isn't it. so how is it being used here exactly. sorry for the naive doubts. im still a newbie.

0 Karma
Get Updates on the Splunk Community!

Automatic Discovery Part 1: What is Automatic Discovery in Splunk Observability Cloud ...

If you’ve ever deployed a new database cluster, spun up a caching layer, or added a load balancer, you know it ...

Real-Time Fraud Detection: How Splunk Dashboards Protect Financial Institutions

Financial fraud isn't slowing down. If anything, it's getting more sophisticated. Account takeovers, credit ...

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

 Are you tired of troubleshooting delays caused by siloed frontend, application, and network data? We've got a ...