Splunk Search

how to get correlated data with stats

Builder

Is there a way to create a relationship between 2 fields, using a row concept, for use with stats or chart?

A report I am building uses stats to get latest values per user from the raw data. In the data, a UserID is unique to each user, but is only present in specific events ("auth" events). All events are tagged with a VisitorID, which can change (i.e. each user can have multiple visitorId's).

Here's an example to show how I have built my report

eventtype="SomeEvent" OR eventtype="OtherEvent" OR eventtype="Authentication"
| eval SomeTime=if(eventtype="SomeEvent",_time,NULL)
| eval OtherTime=if(eventtype="OtherTime",_time,NULL)
| stats latest(UserID) latest(SomeValue) as SomeValue latest(SomeTime) as SomeTime latest(OtherTime) as OtherTime latest(OtherValue) as OtherValue by VisitorID

From here, I get a list of UserID's, *Values and *Times by VisitorID:

VisitorID UserID  SomeValue   SomeTime  OtherValue  OtherTime
a         001                           1           1012
b         001     3           987       3           1007
c         001     5           1008      

What I am trying to do here for my report is to find the most recent *Value by User (the stats command above looses _time field, so I propogate it with eval via EVENTTime)

... | stats SomeValue[max(SomeTime)] as SomeValue OtherValue[max(OtherTime)] by UseerID

This would give me the correct data: UserID 001, SomeValue 5, OtherValue 3

is there a way to do this with chart or stats? Some sort of compound use of stats / chart functions?

Here's a set of example data that could produce the data above by VisitorID:

visitor: b; userid: 001; type: authentication
visitor: b; type: someEvent; someValue: 3
visitor: b; type: otherEvent; otherValue: 1
visitor: c; userid: 001; type: authentication
visitor: c; type: someEvent; someValue: 5
visitor: a; userid: 001; type: authentication
visitor: a; type: otherEvent; otherValue: 3    

thanks
brett

0 Karma
1 Solution

Builder

In order to get this to work, and get the most recent event per userId, regardless of visitor ID, I used a concatenation with the time, combined with a sort, and then split with mvindex later on. Here's an example:

eventtype="SomeEvent" OR eventtype="OtherEvent" OR eventtype="Authentication"
| eval SomeDetail=if(eventtype="SomeEvent",_time." ".SomeValue,NULL)
| eval OtherDetail=if(eventtype="OtherEvent",_time." ".OtherValue,NULL)
| stats latest(UserID) as UserID latest(SomeDetail) as SomeDetail latest(OtherDetail) as OtherDetail by VisitorID
| stats max(SomeDetail) as SomeDetail max(OtherDetail) as OtherDetail by UserID
| eval SomeValue=mvindex(split(SomeDetail," "),1)
| eval OtherValue=mvindex(split(OtherDetail," "),1)

This way, by selecting the max per UserID i get the latest *Detail field, regardless of what the visitorID was (the max has the largest _time).

View solution in original post

0 Karma

Builder

In order to get this to work, and get the most recent event per userId, regardless of visitor ID, I used a concatenation with the time, combined with a sort, and then split with mvindex later on. Here's an example:

eventtype="SomeEvent" OR eventtype="OtherEvent" OR eventtype="Authentication"
| eval SomeDetail=if(eventtype="SomeEvent",_time." ".SomeValue,NULL)
| eval OtherDetail=if(eventtype="OtherEvent",_time." ".OtherValue,NULL)
| stats latest(UserID) as UserID latest(SomeDetail) as SomeDetail latest(OtherDetail) as OtherDetail by VisitorID
| stats max(SomeDetail) as SomeDetail max(OtherDetail) as OtherDetail by UserID
| eval SomeValue=mvindex(split(SomeDetail," "),1)
| eval OtherValue=mvindex(split(OtherDetail," "),1)

This way, by selecting the max per UserID i get the latest *Detail field, regardless of what the visitorID was (the max has the largest _time).

View solution in original post

0 Karma

Influencer

So the problem you really need to solve is how to get the UserID in each of the events based on VisitorID. I think the rest is trivial after that ?

If VisitorID is also unique,

... | eventstats first(UserId) as UserId by VisitorID  

Will populate the correct UserId in each event, and then you can use UserId in your 'by' clauses.

Builder

sorry, to be more precise, I was getting the UserID into each event. In my example above, i was doing a latest(UserID) in the first "stats" command (my actual search in splunk was latest(UserID) as UserID, which is exactly as you suggested 🙂

0 Karma

Builder

well not quite - i was doing a first, or doing a last with a sort between the 2 stats, but this could give incorrect "last" or "first" values in teh 2nd stats not necessarily being the most recent. I'll post the way I got it to work below...

0 Karma