Hi, I have a log that tracks user changes to a specific field in a form. The process is as follows:
1. The user accesses the form, which generates a log event with "get" eventtype along with the current value of field1. This can occur several times as the user refreshes the page, or through code behind the scenes that generates an event based on how long the user stays on the page.
2. The user fills in the form and hits submit, which logs an event with "update" eventtype.
Here's a simplified list of events:
_time, eventtype, sessionid, field1
10:06 update session2 newvalue3
10:05 get session2 newvalue2
09:15 update session1 newvalue2
09:12 get session1 newvalue1
09:10 get session1 newvalue1
09:09 update session1 newvalue1
09:02 get session1 oldvalue1
09:01 get session1 oldvalue1
08:59 get session1 oldvalue1
I'm looking to get the last value of field1 before each "update" eventtype. Basically I'd like to track what the value was before and what it was changed to, something like:
_time, Before, After
10:06 newvalue2 newvalue3
09:15 newvalue1 newvalue2
09:09 oldvalue1 newvalue1
I've tried this with a transaction command on the session, but I run into issues with the multiple instances "get" events in the same session, which makes it a little convoluted to extract the running values of field1. I also tried this with a combination of the latest(field1) and earliest(field1), but then this misses any updates that might take place within the session - we sometimes have users who change the value and then change it back. I'd like to capture those events as well.
Does anyone have any tips on how to get this accomplished?
Thanks!
First, thanks for clearly illustrating raw input, desired output, and the logic to get from there. Transaction is still the easiest way to go. You just need to keep track of which value is which eventtype.
Many people here are familiar with the traditional technique of using string concatenation. I will show a more semantic approach afforded by JSON functions introduced in 8.1.
| rename _raw as temp ``` only if you want to preserve _raw for later ```
| tojson eventtype, field1
| transaction startswith="eventtype=get" endswith="eventtype=update"
| eval _raw = split(_raw, "
")
| eval Before = json_extract(mvindex(_raw, 0), "field1"), After = json_extract(mvindex(_raw, 1), "field1")
| rename temp as _raw ``` only if you want to preserve _raw for later ```
| fields Before, After
Note: The above is not completely semantic as I am also using the side effect of Splunk's default of lexical order.
Here is an emulation for you to play with and compare with real data.
| makeresults format=csv data="_time, eventtype, sessionid, field1
10:06, update, session2, newvalue3
10:05, get, session2, newvalue2
09:15, update, session1, newvalue2
09:12, get, session1, newvalue1
09:10, get, session1, newvalue1
09:09, update, session1, newvalue1
09:02, get, session1, oldvalue1
09:01, get, session1, oldvalue1
08:59, get, session1, oldvalue1"
| eval _time = strptime("2024-08-22T" . _time, "%FT%H:%M")
``` data emulation above ```
Output from the above search gives
Before | After | _time |
newvalue2 | newvalue3 | 2024-08-22 10:05:00 |
newvalue1 | newvalue2 | 2024-08-22 09:12:00 |
oldvalue1 | newvalue1 | 2024-08-22 09:02:00 |
Amazing, worked like a charm. Thanks!
First, thanks for clearly illustrating raw input, desired output, and the logic to get from there. Transaction is still the easiest way to go. You just need to keep track of which value is which eventtype.
Many people here are familiar with the traditional technique of using string concatenation. I will show a more semantic approach afforded by JSON functions introduced in 8.1.
| rename _raw as temp ``` only if you want to preserve _raw for later ```
| tojson eventtype, field1
| transaction startswith="eventtype=get" endswith="eventtype=update"
| eval _raw = split(_raw, "
")
| eval Before = json_extract(mvindex(_raw, 0), "field1"), After = json_extract(mvindex(_raw, 1), "field1")
| rename temp as _raw ``` only if you want to preserve _raw for later ```
| fields Before, After
Note: The above is not completely semantic as I am also using the side effect of Splunk's default of lexical order.
Here is an emulation for you to play with and compare with real data.
| makeresults format=csv data="_time, eventtype, sessionid, field1
10:06, update, session2, newvalue3
10:05, get, session2, newvalue2
09:15, update, session1, newvalue2
09:12, get, session1, newvalue1
09:10, get, session1, newvalue1
09:09, update, session1, newvalue1
09:02, get, session1, oldvalue1
09:01, get, session1, oldvalue1
08:59, get, session1, oldvalue1"
| eval _time = strptime("2024-08-22T" . _time, "%FT%H:%M")
``` data emulation above ```
Output from the above search gives
Before | After | _time |
newvalue2 | newvalue3 | 2024-08-22 10:05:00 |
newvalue1 | newvalue2 | 2024-08-22 09:12:00 |
oldvalue1 | newvalue1 | 2024-08-22 09:02:00 |