Splunk Search

Fields with multiple values duplicating in managed lookup

bigchungusfan55
Explorer

I've been tasked with developing my organization's asset and identity lookups for Splunk ES.

I am using managed lookups to store the data and have a report run once a day to refresh the lookups with new data that may have been ingested from our AD index.

The issue I am running into is that every time the report runs, the values in my lookup tables are duplicating where there are multiple values stored. For example, a device may have the categories a and b, but the next time the report runs it is listed as having the categories a, b, and ab, and the following day a, b, and abab with this cycle repeating forever. I have attempted to use the dedup and mvdedup function between the inputlookup and outputlookup statements but it does not work. I have pasted a sanitized version of the identity lookup generator report's SPL below:

index=azuread sourcetype=azure:aad:user
| eval bunit=<bunit eval statement>
| eval alias=onPremisesSamAccountName
| eval priority=if(MATCH(userType,"Member"),"medium", priority)
| eval priority=if(MATCH(userType,"Guest"),"low", priority)
| eval category=if(MATCH(userType,"Member"), mvappend("Member",category), category)
| eval category=if(MATCH(userType,"Guest"), mvappend("Guest",category), category)
| eval category=if(LIKE(jobTitle,"%Chief%"), mvappend("Management",category), category)
| eval category=if(LIKE(jobTitle,"%Chief%"), mvappend("Executive",category), category)
| eval category=if(LIKE(jobTitle,"%Manager%"), mvappend("Management",category), category)
| eval identity=mvdedup(lower(mvappend(alias,mail)))
| replace null WITH ""
| inputlookup ad_identities append=t
| stats count values(category) as category, values(mail) as email, values(department) as department, values(displayName) as nick, values(jobTitle) as title, values(bunit) as bunit, values(mobilePhone) as phone, values(givenName) as first, values(surname) as last, values(city) as work_city, values(lastPasswordChangeDateTime) as lastPasswordChangeDateTime, values(alias) as alias, values(identity) as identity, values(priority) as priority by userPrincipalName
| eval bunit=mvdedup(bunit)
| eval category=mvdedup(category)
| fields - count
| outputlookup ad_identities

Any advice would be greatly appreciated.

Labels (1)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

When you run outputlookup on a row with multivalue fields, the lookup will flatten the multi value fields to space (or comma - I forget) separated values. 

If you want the lookup file to preserve MV then you must use 

| outputlookup ad_identities output_format=splunk_mv_csv

which will preserve MV fields.

You also do not need to use stats count and then remove count. Just don't use count in your stats command

 

View solution in original post

bowesmana
SplunkTrust
SplunkTrust

When you run outputlookup on a row with multivalue fields, the lookup will flatten the multi value fields to space (or comma - I forget) separated values. 

If you want the lookup file to preserve MV then you must use 

| outputlookup ad_identities output_format=splunk_mv_csv

which will preserve MV fields.

You also do not need to use stats count and then remove count. Just don't use count in your stats command

 

bigchungusfan55
Explorer

This worked! I just had to make sure to clear the fields that had duplicating issues in the managed lookup editor before running the report. Thanks!

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...