Hi everyone,
I've seen a few posts on here and elsewhere that seem to detail the same issue I'm having, but none of the solutions do the trick for me. Any help is appreciated.
The goal is to flag users whose search engine queries (field name searched_for) contain words stored in a lookup table. Because those words could occur anywhere in the search query, wildcard matching is needed.
I have a lookup table called keywords.csv. It contains two columns:
keyword,classification
splunk,test classification
The first use of the lookup works as it should, showing only events with a keyword match anywhere in searched_for:
| search
[| inputlookup keywords.csv
| eval searched_for="*".keyword."*"
| fields searched_for
| format]
The next step is to enrich the remaining events with the classification, and then filter out all events without a classification:
| lookup keywords.csv keyword AS searched_for OUTPUT classification
| search classification=*
The problem is that the above SPL only enriches events in which the keyword exactly matches searched_for. If I search in Google for "splunk", the event is enriched; if I search for "word splunk word", the event is not enriched.
Is there a way around this without using | lookup? Or am I doing something wrong here? I'm out of ideas. I've tried:
For efficiency reasons, WILDCARD(keyword) only supports a wildcard after some initial fixed characters, like splunk*, spl*nk, etc. If you have a table
keyword | classification |
splunk* | test classification |
spl*nk | test classification 2 |
with WILDCARD(keyword) set as the match type in the lookup definition, and test it against the following values
searched_for |
splunk |
splonk |
splunky |
splash wonk |
splunkie |
splunked |
splonking |
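You'll get these (splunk gets both classifications as a multivalue result because it matches both patterns; splonking gets none because a pattern has to match the whole value, not just part of it):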
searched_for | classification |
splunk | test classification test classification 2 |
splonk | test classification 2 |
splunky | test classification |
splash wonk | test classification 2 |
splunkie | test classification |
splunked | test classification |
splonking |
Here is an emulation of the above:
| makeresults
| eval searched_for = mvappend("splunk", "splonk", "splunky", "splash wonk", "splunkie", "splunked", "splonking")
| mvexpand searched_for
| lookup keywords.csv keyword AS searched_for OUTPUT classification
| table searched_for classification
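For reference, the "WILDCARD(keyword) in lookup definition" part can be sketched as a transforms.conf stanza. This is only a minimal sketch: the stanza name keywords_wildcard is a hypothetical name chosen for illustration, and it assumes keywords.csv is already uploaded as a lookup table file in the same app.

```ini
# transforms.conf
# "keywords_wildcard" is a hypothetical definition name, not from the original post
[keywords_wildcard]
# CSV lookup file that holds the keyword and classification columns
filename = keywords.csv
# Treat * in the keyword column as a wildcard when matching
match_type = WILDCARD(keyword)
```

Since the match type belongs to the lookup definition rather than to the CSV file, the lookup command would then most likely need to reference the definition name, for example: | lookup keywords_wildcard keyword AS searched_for OUTPUT classification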
Hope this helps.