Splunk Search

Fields with multiple values duplicating in managed lookup

bigchungusfan55
Explorer

I've been tasked with developing my organization's asset and identity lookups for Splunk ES.

I am using managed lookups to store the data and have a report run once a day to refresh the lookups with new data that may have been ingested from our AD index.

The issue I am running into is that every time the report runs, the values in my lookup tables are duplicating where there are multiple values stored. For example, a device may have the categories a and b, but the next time the report runs it is listed as having the categories a, b, and ab, and the following day a, b, and abab with this cycle repeating forever. I have attempted to use the dedup and mvdedup function between the inputlookup and outputlookup statements but it does not work. I have pasted a sanitized version of the identity lookup generator report's SPL below:

index=azuread sourcetype=azure:aad:user
| eval bunit=<bunit eval statement>
| eval alias=onPremisesSamAccountName
| eval priority=if(MATCH(userType,"Member"),"medium", priority)
| eval priority=if(MATCH(userType,"Guest"),"low", priority)
| eval category=if(MATCH(userType,"Member"), mvappend("Member",category), category)
| eval category=if(MATCH(userType,"Guest"), mvappend("Guest",category), category)
| eval category=if(LIKE(jobTitle,"%Chief%"), mvappend("Management",category), category)
| eval category=if(LIKE(jobTitle,"%Chief%"), mvappend("Executive",category), category)
| eval category=if(LIKE(jobTitle,"%Manager%"), mvappend("Management",category), category)
| eval identity=mvdedup(lower(mvappend(alias,mail)))
| replace null WITH ""
| inputlookup ad_identities append=t
| stats count values(category) as category, values(mail) as email, values(department) as department, values(displayName) as nick, values(jobTitle) as title, values(bunit) as bunit, values(mobilePhone) as phone, values(givenName) as first, values(surname) as last, values(city) as work_city, values(lastPasswordChangeDateTime) as lastPasswordChangeDateTime, values(alias) as alias, values(identity) as identity, values(priority) as priority by userPrincipalName
| eval bunit=mvdedup(bunit)
| eval category=mvdedup(category)
| fields - count
| outputlookup ad_identities

Any advice would be greatly appreciated.

Labels (1)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

When you run outputlookup on a row with multivalue fields, the lookup will flatten the multi value fields to space (or comma - I forget) separated values. 

If you want the lookup file to preserve MV then you must use 

| outputlookup ad_identities output_format=splunk_mv_csv

which will preserve MV fields.

You also do not need to use stats count and then remove count. Just don't use count in your stats command

 

View solution in original post

bowesmana
SplunkTrust
SplunkTrust

When you run outputlookup on a row with multivalue fields, the lookup will flatten the multi value fields to space (or comma - I forget) separated values. 

If you want the lookup file to preserve MV then you must use 

| outputlookup ad_identities output_format=splunk_mv_csv

which will preserve MV fields.

You also do not need to use stats count and then remove count. Just don't use count in your stats command

 

bigchungusfan55
Explorer

This worked! I just had to make sure to clear the fields that had duplicating issues in the managed lookup editor before running the report. Thanks!

0 Karma
Get Updates on the Splunk Community!

Fun with Regular Expression - multiples of nine

Fun with Regular Expression - multiples of nineThis challenge was first posted on Slack #regex channel ...

[Live Demo] Watch SOC transformation in action with the reimagined Splunk Enterprise ...

Overwhelmed SOC? Splunk ES Has Your Back Tool sprawl, alert fatigue, and endless context switching are making ...

What’s New & Next in Splunk SOAR

Security teams today are dealing with more alerts, more tools, and more pressure than ever.  Join us on ...