Splunk Search

Multiple Stats Multiple Indexes

willadams
Contributor

I have 3 data sets that I need to combine with 1 data set not having a field to perform a compare.  I initially started with a join but moved away from that given the limited results returned and also to save on processing time of running the query.  My data sets are:

Users Index

index=users sourcetype=userlist

This has the following fields

user_id
user_title
user_name

 

Workstations Index

index=workstations sourcetype=machines

This has the following fields

pc_id
pc_type
user_name

 

I can then do a stats instead of join on this data using user_name as the "join".  So my query is as follows (note a user can have more than 1 PC hence the mvexpand to break into individual entries)

 

(index=users sourcetype=userlist) OR (index=workstations sourcetype=machines)
| stats values(*) as * by user_name
| mvexpand pc_id
| table pc_id pc_type user_name

 

I now want to take the information I have from this search with various other filters from another source (for example 

 

index=windows sourcetype=connections which has lots of fields but I am only interested in the following

device_name
last_activity_time

 

I want to be able to match the results from my combined search and extend this to check against the 3rd index, matching on device_name against pc_id.  I am thinking that my query might add another statement into the first line of my SPL for example 

(i.e. index=users ...) OR (index=workstations .....) OR (index=windows | fields device_name last_activity_time)
| stats values(*) as * by user_name

 

however I don't think this will work as user_name is not a field in the "windows" index.  I am not sure how to now extend this to then look into another index without then doing a join. 

 

Labels (2)
0 Karma
1 Solution

manjunathmeti
Champion

hi @willadams,
Use coalesce function on fields pc_id and device_name, then do stats on pc_id and user_name.

(index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR index=windows sourcetype=connections 
| fields user_id, user_title, user_name, pc_id, pc_type, device_name, last_activity_time
| eval pc_id=coalesce(pc_id, device_name)
| stats values(*) as * by pc_id, user_name
| mvexpand pc_id

View solution in original post

0 Karma

manjunathmeti
Champion

hi @willadams,
Use coalesce function on fields pc_id and device_name, then do stats on pc_id and user_name.

(index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR index=windows sourcetype=connections 
| fields user_id, user_title, user_name, pc_id, pc_type, device_name, last_activity_time
| eval pc_id=coalesce(pc_id, device_name)
| stats values(*) as * by pc_id, user_name
| mvexpand pc_id

View solution in original post

0 Karma

willadams
Contributor

@manjunathmeti thanks.  I have run into a problem though and realised that 2 of my indexers have the same field "names" in the index but different values.  Do you know if there is anyway to do a rename of the fields before stats based on the source of the data and then to do values as post processing?  For example

 

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections | rename last_activity_time AS win_last_activity)

 

This will generate a error in search (unable to parse the search: unbalanced parentheses). I have tried adjusting to something similar such as

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections [| rename last_activity_time AS win_last_activity])

or even

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections (| rename last_activity_time AS win_last_activity))

but have no luck so far. The only solution so far would suggest a change to my data source which would be a massive undertaking.  My only other thoughts may be to have a lookup table done from my last data set and change the values and then update this query using something like 

 

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (lookup pc_activity_log.csv)

 

but would much prefer to keep it in a single query if at all possible. 

0 Karma

willadams
Contributor

I found a "dirty" way to then use the existing query and then run a join after to the additional data source

 

0 Karma

manjunathmeti
Champion

You can create a new field based on the field name and assign value only if a particular index name matches.
Try this,

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections) | eval win_last_activity=if(index="windows", last_activity_time, ""), work_last_activity=if(index="workstations", last_activity_time, "")

And use win_last_activity and work_last_activity fields for the next operations.

willadams
Contributor

I am testing using source instead, but trying to figure out an issue wherein it doesn't recognise the field name.  I confirmed this with the eval and instead of "" did a "0" and this returns the "0" not the actual values of that field.  

 

Will feedback once I work my way around it.  

0 Karma

willadams
Contributor

Thanks again @manjunathmeti.  Did consider the eval but was trying to do it in line with the multi-index search, however never considered to use eval with the if statement.  Will give it a crack and see what differences I have when my query has run.  

0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!