Splunk Search

Multiple Stats Multiple Indexes

willadams
Contributor

I have 3 data sets that I need to combine with 1 data set not having a field to perform a compare.  I initially started with a join but moved away from that given the limited results returned and also to save on processing time of running the query.  My data sets are:

Users Index

index=users sourcetype=userlist

This has the following fields

user_id
user_title
user_name

 

Workstations Index

index=workstations sourcetype=machines

This has the following fields

pc_id
pc_type
user_name

 

I can then do a stats instead of join on this data using user_name as the "join".  So my query is as follows (note a user can have more than 1 PC hence the mvexpand to break into individual entries)

 

(index=users sourcetype=userlist) OR (index=workstations sourcetype=machines)
| stats values(*) as * by user_name
| mvexpand pc_id
| table pc_id pc_type user_name

 

I now want to take the information I have from this search with various other filters from another source (for example 

 

index=windows sourcetype=connections which has lots of fields but I am only interested in the following

device_name
last_activity_time

 

I want to be able to match the results from my combined search and extend this to check against the 3rd index, matching on device_name against pc_id.  I am thinking that my query might add another statement into the first line of my SPL for example 

(i.e. index=users ...) OR (index=workstations .....) OR (index=windows | fields device_name last_activity_time)
| stats values(*) as * by user_name

 

however I don't think this will work as user_name is not a field in the "windows" index.  I am not sure how to now extend this to then look into another index without then doing a join. 

 

Labels (2)
0 Karma
1 Solution

manjunathmeti
Champion

hi @willadams,
Use coalesce function on fields pc_id and device_name, then do stats on pc_id and user_name.

(index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR index=windows sourcetype=connections 
| fields user_id, user_title, user_name, pc_id, pc_type, device_name, last_activity_time
| eval pc_id=coalesce(pc_id, device_name)
| stats values(*) as * by pc_id, user_name
| mvexpand pc_id

View solution in original post

0 Karma

manjunathmeti
Champion

hi @willadams,
Use coalesce function on fields pc_id and device_name, then do stats on pc_id and user_name.

(index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR index=windows sourcetype=connections 
| fields user_id, user_title, user_name, pc_id, pc_type, device_name, last_activity_time
| eval pc_id=coalesce(pc_id, device_name)
| stats values(*) as * by pc_id, user_name
| mvexpand pc_id
0 Karma

willadams
Contributor

@manjunathmeti thanks.  I have run into a problem though and realised that 2 of my indexers have the same field "names" in the index but different values.  Do you know if there is anyway to do a rename of the fields before stats based on the source of the data and then to do values as post processing?  For example

 

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections | rename last_activity_time AS win_last_activity)

 

This will generate a error in search (unable to parse the search: unbalanced parentheses). I have tried adjusting to something similar such as

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections [| rename last_activity_time AS win_last_activity])

or even

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections (| rename last_activity_time AS win_last_activity))

but have no luck so far. The only solution so far would suggest a change to my data source which would be a massive undertaking.  My only other thoughts may be to have a lookup table done from my last data set and change the values and then update this query using something like 

 

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (lookup pc_activity_log.csv)

 

but would much prefer to keep it in a single query if at all possible. 

0 Karma

willadams
Contributor

I found a "dirty" way to then use the existing query and then run a join after to the additional data source

 

0 Karma

manjunathmeti
Champion

You can create a new field based on the field name and assign value only if a particular index name matches.
Try this,

index=users sourcetype=userlist) OR (index=workstations sourcetype=machines) OR (index=windows sourcetype=connections) | eval win_last_activity=if(index="windows", last_activity_time, ""), work_last_activity=if(index="workstations", last_activity_time, "")

And use win_last_activity and work_last_activity fields for the next operations.

willadams
Contributor

I am testing using source instead, but trying to figure out an issue wherein it doesn't recognise the field name.  I confirmed this with the eval and instead of "" did a "0" and this returns the "0" not the actual values of that field.  

 

Will feedback once I work my way around it.  

0 Karma

willadams
Contributor

Thanks again @manjunathmeti.  Did consider the eval but was trying to do it in line with the multi-index search, however never considered to use eval with the if statement.  Will give it a crack and see what differences I have when my query has run.  

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...