Solved: Re: How to use lookup with wildcards to filter eve...

pjb2160 · ‎02-05-2015

Hello,

I have two log sources (AD logs and approval logs) which I am performing a correlation on (via a join). Each log source has a group_name field.

I wish to use a lookup table to filter out events which match the values in the lookup table. However, the lookup table will have wild cards AND I wish to apply the filter after I have performed the join. My train of thought is that way I don't have to do the look up twice (once for each sub-search).

For example, my lookup table, called ad_group_exclusions, has the following values:

group_name (this is the column heading)
group-foo-d-*
group-bar-t-*

And I am thinking my query should look something like this:

index=active_directory sourcetype=ad_logs [| inputlookup ad_event_codes | fields event_id ]
 | eval common_group = lower(group_name) | eval common_user = lower(changed_user_id)
 | join type=outer common_group common_user
      [search index=approvals sourcetype=approval_logs
      | eval common_group=lower(groups_affected) | eval common_user=lower(users_affected) | eval approval_logs_raw = _raw]
 | table common_group common_user _raw approval_logs_raw
 | where NOT ([ | inputlookup ad_group_exclusions | fields group_name ])

Obviously this is not working as I've tested it by populating the lookup with exact matches as well as the wild card values.

Any suggestions?

Cheers,
P

David · ‎02-05-2015

You should be able to do a normal wildcard lookup for exclusions and then filter on the looked up field.

Your lookup could look like this:

group_name,ShouldExclude
group-foo-d-*,Exclude
group-bar-t-*,Exclude

Your transforms would look like (note that you can't trigger the match_type parameter via the GUI):

[MyLookup]
filename=myfile
match_type=WILDCARD(group_name)

Your search would look like:

 index=active_directory sourcetype=ad_logs [| inputlookup ad_event_codes | fields event_id ]
  | eval common_group = lower(group_name) | eval common_user = lower(changed_user_id)
  | join type=outer common_group common_user
       [search index=approvals sourcetype=approval_logs
       | eval common_group=lower(groups_affected) | eval common_user=lower(users_affected) | eval approval_logs_raw = _raw]
  | table common_group common_user _raw approval_logs_raw
  | lookup ad_group_exclusions group_name as common_group OUTPUT ShouldExclude
  | search NOT ShouldExclude="Exclude"
  | where NOT ([ | inputlookup ad_group_exclusions | fields group_name ])

An alternative option (which would probably be faster, and would be the only option had I thought of it before I wrote the above...) would be a to put the lookup in a subsearch at the start, and leverage Splunk's ability to take a "search" field from a subsearch and just add it to the normal search string. Your search would look like:

 index=active_directory sourcetype=ad_logs [| inputlookup ad_event_codes | fields event_id ] [|inputlookup ad_group_exclusions | stats values(group_name) as search | eval search="NOT (group_name=" . mvjoin(search, " OR group_name=") . ")"] 
  | eval common_group = lower(group_name) | eval common_user = lower(changed_user_id)
  | join type=outer common_group common_user
       [search index=approvals sourcetype=approval_logs [|inputlookup ad_group_exclusions | stats values(group_name) as search | eval search="NOT (group_name=" . mvjoin(search, " OR group_name=") . ")"]
       | eval common_group=lower(groups_affected) | eval common_user=lower(users_affected) | eval approval_logs_raw = _raw]

You do run the subsearch twice, but that should be much faster than pulling back unnecessary logs off disk. (There could be reasons this wouldn't work, in which case the first option above would, but I would try this first).

View solution in original post

David · ‎02-05-2015

You should be able to do a normal wildcard lookup for exclusions and then filter on the looked up field.

Your lookup could look like this:

group_name,ShouldExclude
group-foo-d-*,Exclude
group-bar-t-*,Exclude

Your transforms would look like (note that you can't trigger the match_type parameter via the GUI):

[MyLookup]
filename=myfile
match_type=WILDCARD(group_name)

Your search would look like:

 index=active_directory sourcetype=ad_logs [| inputlookup ad_event_codes | fields event_id ]
  | eval common_group = lower(group_name) | eval common_user = lower(changed_user_id)
  | join type=outer common_group common_user
       [search index=approvals sourcetype=approval_logs
       | eval common_group=lower(groups_affected) | eval common_user=lower(users_affected) | eval approval_logs_raw = _raw]
  | table common_group common_user _raw approval_logs_raw
  | lookup ad_group_exclusions group_name as common_group OUTPUT ShouldExclude
  | search NOT ShouldExclude="Exclude"
  | where NOT ([ | inputlookup ad_group_exclusions | fields group_name ])

An alternative option (which would probably be faster, and would be the only option had I thought of it before I wrote the above...) would be a to put the lookup in a subsearch at the start, and leverage Splunk's ability to take a "search" field from a subsearch and just add it to the normal search string. Your search would look like:

 index=active_directory sourcetype=ad_logs [| inputlookup ad_event_codes | fields event_id ] [|inputlookup ad_group_exclusions | stats values(group_name) as search | eval search="NOT (group_name=" . mvjoin(search, " OR group_name=") . ")"] 
  | eval common_group = lower(group_name) | eval common_user = lower(changed_user_id)
  | join type=outer common_group common_user
       [search index=approvals sourcetype=approval_logs [|inputlookup ad_group_exclusions | stats values(group_name) as search | eval search="NOT (group_name=" . mvjoin(search, " OR group_name=") . ")"]
       | eval common_group=lower(groups_affected) | eval common_user=lower(users_affected) | eval approval_logs_raw = _raw]

You do run the subsearch twice, but that should be much faster than pulling back unnecessary logs off disk. (There could be reasons this wouldn't work, in which case the first option above would, but I would try this first).

pjb2160 · ‎02-08-2015

Thanks David, I can see the option whereby I run the sub-search twice is the better option so I will look to progress with this.

I have been playing around with this just using the one data source (active_directory) and it seems to work for the most part.

index=active_directory sourcetype=ad_logs [| inputlookup ad_event_codes | fields event_id ]
[|inputlookup ad_group_exclusions | stats values(group_name) as search | eval search="NOT (group_name=" . mvjoin(search, " OR group_name=") . ")"]
| table group_name changed_user_id _raw

The new issue is how do I ensure the comparison for the sub-search NOT is done with no case sensitivity? Previously, in the join I've used the lower(...) command to ensure the join compares values in lower case but it's not clear how I can do so using your suggestion?

Thanks again,
P

pjb2160 · ‎02-08-2015

OK, good news, I've resolved the case sensitivity issue when searching only on the AD logs by updating the transforms.conf to:

[ad_group_exclusions]
filename = ad_group_exclusions.csv
case_sensitive_match = false
match_type = WILDCARD(ex_group_name)

added the following line to the props.conf:

LOOKUP-ad_group_exclusions = ad_group_exclusions ex_group_name AS group_name  OUTPUT ex_group_name, should_exclude

have updated the .csv to look like:

ex_group_name,should_exclude
 group-foo-d-*,exclude
 group-bar-t-*,exclude

And the resulting query is now:

index=active_directory sourcetype=ad_logs [| inputlookup ad_event_codes | fields event_id ]
[|inputlookup ad_group_exclusions | stats values(ex_group_name) as search | eval search="NOT (group_name=" . mvjoin(search, " OR group_name=") . ")"]
| eval common_group = lower(group_name) | eval common_user = lower(changed_user_id) 
| table common_group common_user  _raw

I've updated the .csv file with a variety of values to test the filtering and it works well.

pjb2160 · ‎02-08-2015

I've now tried to apply it to the approval_logs only however the approval_logs have two multi-value fields in them. I previously removed the line in the query addressing this as i initially thought it not to be relevant. Here's my current query:

index=approvals sourcetype=approval_logs
[| makemv delim=";" groups_affected | makemv delim=";" users_affected | mvexpand groups_affected | mvexpand users_affected]
[|inputlookup ad_group_exclusions | stats values(ex_group_name) as search | eval search="NOT (groups_affected=" . mvjoin(search, " OR groups_affected=") . ")"]
| eval common_group=lower(groups_affected) | eval common_user=lower(users_affected)
| table common_group common_user _raw

Also, I updated the props.conf file under the relevant stanza:

LOOKUP-ad_group_exclusions = ad_group_exclusions ex_group_name AS groups_affected OUTPUT ex_group_name, should_exclude

but the group exclusion doesn't do anything. I think it's because the multi-value field, groups_affected hasn't yet been made MV and expanded. It's my understanding that the look up will be run first so that won't match on any of the multi-value fields.

I have tried swapping parts of the search around but i have had no success, either the query breaks syntactically or yields no results at all.

Any additional help would be greatly appreciated.

pjb2160 · ‎02-11-2015

Sorted, I've used @David's first suggestion to filter out the non-relevant results at the end of it all:

 index=approvals sourcetype=approval_logs
 [| makemv delim=";" groups_affected | makemv delim=";" users_affected | mvexpand groups_affected | mvexpand users_affected]
 [|inputlookup ad_group_exclusions | stats values(ex_group_name) as search | eval search="NOT (groups_affected=" . mvjoin(search, " OR groups_affected=") . ")"]
 | eval common_group=lower(groups_affected) | eval common_user=lower(users_affected)
 | rex field=common_group "\s+(?<common_group>.*)"
 | rex field=common_user "\s+(?<common_user>.*)"
 | table common_group common_user _raw
   | lookup ad_group_exclusions ex_group_name AS common_group OUTPUT should_exclude
   | search NOT should_exclude="exclude"
   | where NOT ([ | inputlookup ad_group_exclusions | fields ex_group_name ])

I also added some regex to strip out any of the special characters prepended to each new entry resulting from the mvexpand command.

Cheers,
P

How to use lookup with wildcards to filter events

Splunk Decoded: Service Maps vs Service Analyzer Tree View vs Flow Maps

What’s New in Splunk Observability – September 2025

Fun with Regular Expression - multiples of nine

Are you a member of the Splunk Community?

How to use lookup with wildcards to filter events

Splunk Decoded: Service Maps vs Service Analyzer Tree View vs Flow Maps

What’s New in Splunk Observability – September 2025

Fun with Regular Expression - multiples of nine