How to flag users whose search engine queries contain certain words stored in a lookup table?

shawngunnison
Engager

Hi everyone,

I've seen a few posts on here and elsewhere that seem to detail the same issue I'm having, but none of the solutions do the trick for me. Any help is appreciated. 

The goal is to flag users whose search engine queries (field name searched_for) contain words stored in a lookup table. Because those words can occur anywhere in the search query, wildcard matching is needed.

I have a lookup table called keywords.csv. It contains two columns: 

keyword,classification
splunk,test classification

The first use of the lookup works as it should, showing only events where a keyword matches anywhere in searched_for:

| search
    [| inputlookup keywords.csv
    | eval searched_for="*".keyword."*"
    | fields searched_for
    | format]

The next step is to enrich the remaining events with the classification, and then filter out all events without a classification, like so:

| lookup keywords.csv keyword AS searched_for OUTPUT classification
| search classification=*

The problem is that the SPL above only enriches events in which the keyword exactly matches searched_for. If I search Google for "splunk", the event is enriched; if I search for "word splunk word", it is not.

Is there a way around this without using | lookup? Or am I doing something wrong here? I'm out of ideas. I've tried:

  • Prepending and appending * to the keyword in the lookup table (*splunk*)
  • Adding a lookup definition with match type WILDCARD(searched_for)
  • Thinking the issue might be that searched_for is an evaluated field, I changed the match type and the SPL to use the field "url", which comes straight from the logs and contains the search query string. Still no enrichment.
  • Deleted and re-created the lookup, definition, and matchtype.

yuanliu
SplunkTrust

For efficiency reasons, WILDCARD(searched_for) only supports a wildcard after some initial fixed characters, like splunk*, spl*nk, etc. If you have a table

keyword,classification
splunk*,test classification
spl*nk,test classification 2

with WILDCARD(keyword) in the lookup definition, and test the following values

searched_for
splunk
splonk
splunky
splash wonk
splunkie
splunked
splonking

You'll get these:

searched_for    classification
splunk          test classification
                test classification 2
splonk          test classification 2
splunky         test classification
splash wonk     test classification 2
splunkie        test classification
splunked        test classification
splonking
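
The results above can be reproduced outside Splunk to see why each value gets its classification. The following is a minimal Python sketch, not Splunk's actual implementation. It assumes that `*` matches any run of characters (including none), that a pattern must match the whole field value, and that matching is case-sensitive. The names `LOOKUP_ROWS`, `wildcard_to_regex`, and `classify` are made up for illustration.

```python
import re

# Rows of the example lookup table from this answer: (keyword, classification).
LOOKUP_ROWS = [
    ("splunk*", "test classification"),
    ("spl*nk", "test classification 2"),
]


def wildcard_to_regex(pattern):
    """Turn a wildcard pattern into a compiled regex; only '*' is special."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts), re.DOTALL)


def classify(value):
    """Return every classification whose keyword pattern matches the whole value."""
    return [
        classification
        for keyword, classification in LOOKUP_ROWS
        if wildcard_to_regex(keyword).fullmatch(value)
    ]


if __name__ == "__main__":
    for value in ["splunk", "splonk", "splunky", "splash wonk",
                  "splunkie", "splunked", "splonking"]:
        print(value, "->", classify(value))
```

Under these assumptions, `classify("word splunk word")` returns an empty list: neither pattern starts with a wildcard, so text before the keyword prevents a match.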

Here is an emulation of the above:

| makeresults
| eval searched_for = mvappend("splunk", "splonk", "splunky", "splash wonk", "splunkie", "splunked", "splonking")
| mvexpand searched_for
| lookup keywords.csv keyword AS searched_for OUTPUT classification
| table searched_for classification
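
For completeness, the lookup definition that this emulation relies on might look like the stanza below in transforms.conf. This is only a sketch: the stanza name keywords_wildcard is made up for illustration, and only the WILDCARD(keyword) setting comes from the discussion above.

```ini
# Hypothetical lookup definition; the stanza name is an assumption.
[keywords_wildcard]
filename = keywords.csv
# Wildcards are read from the lookup table's keyword column.
match_type = WILDCARD(keyword)
```

Since match_type is a property of the definition, it is worth checking that the lookup command refers to the definition name (here the hypothetical keywords_wildcard) rather than to the CSV file name if wildcard matching seems to be ignored.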

 

Hope this helps.
