Is there a way to create a relationship between 2 fields, using a row concept, for use with stats or chart?
A report I am building uses stats to get latest values per user from the raw data. In the data, a UserID is unique to each user, but is only present in specific events ("auth" events). All events are tagged with a VisitorID, which can change (i.e. each user can have multiple visitorId's).
Here's an example to show how I have built my report
eventtype="SomeEvent" OR eventtype="OtherEvent" OR eventtype="Authentication"
| eval SomeTime=if(eventtype="SomeEvent",_time,NULL)
| eval OtherTime=if(eventtype="OtherTime",_time,NULL)
| stats latest(UserID) latest(SomeValue) as SomeValue latest(SomeTime) as SomeTime latest(OtherTime) as OtherTime latest(OtherValue) as OtherValue by VisitorID
From here, I get a list of UserID's, *Values and *Times by VisitorID:
VisitorID UserID SomeValue SomeTime OtherValue OtherTime
a 001 1 1012
b 001 3 987 3 1007
c 001 5 1008
What I am trying to do here for my report is to find the most recent *Value by User (the stats command above looses _time field, so I propogate it with eval via EVENTTime)
... | stats SomeValue[max(SomeTime)] as SomeValue OtherValue[max(OtherTime)] by UseerID
This would give me the correct data: UserID 001, SomeValue 5, OtherValue 3
is there a way to do this with chart or stats? Some sort of compound use of stats / chart functions?
Here's a set of example data that could produce the data above by VisitorID:
visitor: b; userid: 001; type: authentication
visitor: b; type: someEvent; someValue: 3
visitor: b; type: otherEvent; otherValue: 1
visitor: c; userid: 001; type: authentication
visitor: c; type: someEvent; someValue: 5
visitor: a; userid: 001; type: authentication
visitor: a; type: otherEvent; otherValue: 3
thanks
brett
In order to get this to work, and get the most recent event per userId, regardless of visitor ID, I used a concatenation with the time, combined with a sort, and then split with mvindex later on. Here's an example:
eventtype="SomeEvent" OR eventtype="OtherEvent" OR eventtype="Authentication"
| eval SomeDetail=if(eventtype="SomeEvent",_time." ".SomeValue,NULL)
| eval OtherDetail=if(eventtype="OtherEvent",_time." ".OtherValue,NULL)
| stats latest(UserID) as UserID latest(SomeDetail) as SomeDetail latest(OtherDetail) as OtherDetail by VisitorID
| stats max(SomeDetail) as SomeDetail max(OtherDetail) as OtherDetail by UserID
| eval SomeValue=mvindex(split(SomeDetail," "),1)
| eval OtherValue=mvindex(split(OtherDetail," "),1)
This way, by selecting the max per UserID i get the latest *Detail field, regardless of what the visitorID was (the max has the largest _time).
In order to get this to work, and get the most recent event per userId, regardless of visitor ID, I used a concatenation with the time, combined with a sort, and then split with mvindex later on. Here's an example:
eventtype="SomeEvent" OR eventtype="OtherEvent" OR eventtype="Authentication"
| eval SomeDetail=if(eventtype="SomeEvent",_time." ".SomeValue,NULL)
| eval OtherDetail=if(eventtype="OtherEvent",_time." ".OtherValue,NULL)
| stats latest(UserID) as UserID latest(SomeDetail) as SomeDetail latest(OtherDetail) as OtherDetail by VisitorID
| stats max(SomeDetail) as SomeDetail max(OtherDetail) as OtherDetail by UserID
| eval SomeValue=mvindex(split(SomeDetail," "),1)
| eval OtherValue=mvindex(split(OtherDetail," "),1)
This way, by selecting the max per UserID i get the latest *Detail field, regardless of what the visitorID was (the max has the largest _time).
So the problem you really need to solve is how to get the UserID in each of the events based on VisitorID. I think the rest is trivial after that ?
If VisitorID is also unique,
... | eventstats first(UserId) as UserId by VisitorID
Will populate the correct UserId in each event, and then you can use UserId in your 'by' clauses.
sorry, to be more precise, I was getting the UserID into each event. In my example above, i was doing a latest(UserID) in the first "stats" command (my actual search in splunk was latest(UserID) as UserID
, which is exactly as you suggested 🙂
well not quite - i was doing a first, or doing a last with a sort between the 2 stats, but this could give incorrect "last" or "first" values in teh 2nd stats not necessarily being the most recent. I'll post the way I got it to work below...