Any way to filter multiple wildcard lookup matches to narrowest match?

yuanliu
SplunkTrust

If a value matches multiple rows because of wildcards, I want a way to return only the one match that is "narrowest". Is there a way to construct a lookup filter to do this?

The use case is as follows. Given a wildcard lookup (wildlookup) on matchfield:

matchfield   field1     field2
abcdefg      matchabc   7match
abcdef*      matchabc   6match
abcde*       matchabc   5match
abc*         matchabc   broadmatch

The default behavior (without a lookup filter) is to return every matching row as multivalue fields:

matchfield   field1     field2
abcdefgh     matchabc   6match
             matchabc   5match
             matchabc   broadmatch
abcd         matchabc   broadmatch
abcde        matchabc   5match
             matchabc   broadmatch
abcdef       matchabc   6match
             matchabc   5match
             matchabc   broadmatch
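
To make the multi-match behavior concrete, here is a minimal Python sketch that mimics it outside Splunk. It assumes that only `*` acts as a wildcard and that matches come back in file order. The helper names `wildcard_matches` and `lookup_all` are hypothetical, and this is not a description of how Splunk implements lookups internally.

```python
import re

# Lookup rows from the example table, in file order.
WILDLOOKUP = [
    {"matchfield": "abcdefg", "field1": "matchabc", "field2": "7match"},
    {"matchfield": "abcdef*", "field1": "matchabc", "field2": "6match"},
    {"matchfield": "abcde*", "field1": "matchabc", "field2": "5match"},
    {"matchfield": "abc*", "field1": "matchabc", "field2": "broadmatch"},
]


def wildcard_matches(pattern, value):
    """Return True if value matches pattern, treating only * as a wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def lookup_all(value):
    """Collect field1/field2 from every matching row as multivalue lists."""
    hits = [row for row in WILDLOOKUP if wildcard_matches(row["matchfield"], value)]
    return {
        "field1": [row["field1"] for row in hits],
        "field2": [row["field2"] for row in hits],
    }


for value in ["abcdefgh", "abcd", "abcde", "abcdef"]:
    print(value, lookup_all(value))
```

Each input value ends up with one entry per matching row, which is the multivalue result shown in the table above.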

Because I organized my lookup table so that the narrowest match is the first matching row, I can do:

| eval field1 = mvindex(field1, 0), field2 = mvindex(field2, 0)

But then I have to do this every time I use this lookup. The lookup filter description says:

Filter results from the lookup table before returning data. Create this filter like you would a typical search query using Boolean expressions and/or comparison operators.

Obviously mvindex is neither a Boolean expression nor a comparison operator. How do I set up a filter to do this? More broadly, it would be even better if there were a filter that saved me from having to organize my lookup manually.

1 Solution

VatsalJagani
SplunkTrust

@yuanliu - Use the max_matches=1 parameter in the transforms.conf stanza for your lookup.

* If you don't already have a stanza for the lookup in transforms.conf, add one.
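
As a rough illustration of why this works, the Python sketch below parses a hypothetical stanza and simulates a lookup capped at `max_matches`. The stanza name, the filename `wildlookup.csv`, and the helper functions are assumptions for illustration only. The sketch also assumes that, for a lookup that is not time-based, the first matches in file order are the ones kept, which is my reading of the transforms.conf documentation.

```python
import configparser
import re

# Hypothetical transforms.conf stanza; stanza name and filename are assumptions.
STANZA = """
[wildlookup]
filename = wildlookup.csv
match_type = WILDCARD(matchfield)
max_matches = 1
"""

# Lookup rows in file order, narrowest pattern first (from the question).
ROWS = [
    ("abcdefg", "matchabc", "7match"),
    ("abcdef*", "matchabc", "6match"),
    ("abcde*", "matchabc", "5match"),
    ("abc*", "matchabc", "broadmatch"),
]


def matches(pattern, value):
    """Return True if value matches pattern, treating only * as a wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def simulate_lookup(value, max_matches):
    """Return at most max_matches matching rows, in file order (assumed)."""
    hits = [row for row in ROWS if matches(row[0], value)]
    return hits[:max_matches]


config = configparser.ConfigParser()
config.read_string(STANZA)
limit = config.getint("wildlookup", "max_matches")

for value in ["abcdefgh", "abcd", "abcde", "abcdef"]:
    print(value, simulate_lookup(value, limit))
```

With the rows ordered narrowest first, capping the result at one match leaves only the narrowest row, so no extra mvindex step is needed in the search.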

 

I hope this helps!!! Kindly upvote if it does!!!

PickleRick
SplunkTrust

I don't think you can "order" the lookup results using Splunk's built-in functionality.

It's similar to the frequently asked question about filtering events at ingest based on another event's value. In other words, it requires keeping state.

A lookup filter only tests whether a particular lookup result row meets the criteria. It can't check whether that row is the first one, the last one, or anything else relative to the other matches.
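
The limitation can be pictured with a small Python sketch: a per-row filter is a predicate that sees one candidate row at a time, so it has no view of the other matches. The predicate and row layout below are hypothetical stand-ins, not Splunk's filter syntax.

```python
# Candidate rows a wildcard lookup might return for the value "abcdef",
# taken from the example table in the question.
candidates = [
    {"matchfield": "abcdef*", "field2": "6match"},
    {"matchfield": "abcde*", "field2": "5match"},
    {"matchfield": "abc*", "field2": "broadmatch"},
]


def row_filter(row):
    """A stateless predicate: it can only inspect the row it is given."""
    return row["field2"] != "broadmatch"


def narrowest(rows):
    """Picking the narrowest match needs all rows at once (longest pattern)."""
    return max(rows, key=lambda row: len(row["matchfield"].replace("*", "")))


kept = [row for row in candidates if row_filter(row)]
print([row["field2"] for row in kept])   # two rows survive the per-row filter
print(narrowest(candidates)["field2"])   # a single row, chosen by comparison
```

The per-row predicate can drop rows by their own content, but choosing "the narrowest of these" is a comparison across rows, which is the state PickleRick refers to.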

So, as @VatsalJagani said, you can reorganize your lookup (which can be troublesome) or write a custom command (even more so, I suppose). You could also wrap your lookup in a macro, but that would require some bending over backwards, either with multivalue eval operations or with mvexpand | where. It seems ugly.

VatsalJagani
SplunkTrust

@yuanliu - There are two ways:

1. Order the lookup rows from the longest (narrowest) pattern to the shortest (broadest), so that the narrowest wildcard match sits at the top.

     -> As you mentioned, this requires manually organizing the lookup.

2. Use a Python-based custom command:

    -> I'm not including commands.conf or the full Python code, which I assume you already know or can find in the Splunk docs. I've included just the logic part.

 

import re


def wildcard_match(text, match_string):
    '''
    Match text against a wildcard pattern and return how many literal
    (non-wildcard) characters matched, or -1 if the pattern does not match.
    In your case, it helps find the narrowest match from the lookup.
    '''
    # Split the match string by * (the only wildcard handled here)
    parts = match_string.split('*')

    # Build a regex in which each * matches any run of characters
    pattern = '.*'.join(re.escape(part) for part in parts)

    # The pattern must cover the whole text, not just a substring of it
    if re.fullmatch(pattern, text, flags=re.DOTALL) is None:
        return -1

    return sum(len(part) for part in parts)


def find_best_match_from_lookup(match_strings, text):
    '''
    Loop over the whole lookup to find which entry matches the closest.
    Returns (<match-from-lookup>, <index of the narrowest matching entry>),
    or (None, None) if nothing matches.
    '''
    best_match_string = None
    best_index = None
    highest_result = -1

    for index, match_string in enumerate(match_strings):
        result = wildcard_match(text, match_string)
        if result > highest_result:
            highest_result = result
            best_match_string = match_string
            best_index = index

    return best_match_string, best_index


# And from here you can use that index to return whatever you need to return
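
Going back to option 1, the reordering does not have to be done by hand. The sketch below sorts a lookup CSV so that patterns with more literal characters come first. The sample data, the `specificity` heuristic, and the function names are assumptions for illustration. Counting literal characters is only a rough proxy for "narrowest", although it fits prefix patterns like the ones in this thread.

```python
import csv
import io

# Hypothetical lookup contents, deliberately out of order.
UNSORTED = """matchfield,field1,field2
abc*,matchabc,broadmatch
abcdefg,matchabc,7match
abcde*,matchabc,5match
abcdef*,matchabc,6match
"""


def specificity(pattern):
    """Number of literal (non-wildcard) characters; higher means narrower."""
    return len(pattern.replace("*", ""))


def sort_lookup(csv_text, match_field="matchfield"):
    """Return the CSV text with rows ordered from narrowest to broadest."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = sorted(reader, key=lambda row: specificity(row[match_field]), reverse=True)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


print(sort_lookup(UNSORTED))
```

Running something like this whenever the lookup file changes would keep the narrowest pattern at the top without manual editing.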

 

I hope this helps!!! Kindly upvote if it does!!!


yuanliu
SplunkTrust

Thanks for the suggestions.  I need to clarify the requirement better.

  • Re ordering the rows, I can already do it.  The question is how to return a single row directly from lookup command without extra SPL. (As I illustrated, wildcard lookup will return multivalue fields when there are multiple matches.)
  • Re custom command, that goes outside of lookup.  It basically shifts the design (and maintenance) burden from Splunk to the user (me in this case).

 


yuanliu
SplunkTrust

Doh! How did I not see this option right in the UI? (No need to edit the .conf file.)

[Screenshot: lookup-adv-options.png]
