Search based on word match

kmcaloon
Explorer

I have a search built off of a lookup file that generates a list of words. I'm looking for assistance with a search that would give me results if the field I'm searching on matches 3 or more of those words.

Example:

Word
one
two
three
four
five

I'm looking to search a field in an index and populate results if it contains any 3 (or more) of the Words in the word list.

DalJeanis
SplunkTrust

So you can see what this is all about, here's the search we want to end up with

 your base search  ( "word1" OR "word2" OR "word3" OR "word4" )
 | rex field=yourfield  "(?<MT>word1|word2|word3|word4)" max_match=0
 | eval MT = mvdedup(MT)
 | where mvcount(MT)>2    
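
To check that logic without touching real data, here is a run-anywhere sketch. The two field values are made up for illustration, and n is just a helper field: the first event contains three distinct words (word1 appears twice), the second contains only two distinct words repeated several times.

 | makeresults count=2
 | streamstats count AS n
 | eval yourfield=if(n==1, "word1 word3 word1 word4", "word2 word2 word2 word1")
 | rex field=yourfield  "(?<MT>word1|word2|word3|word4)" max_match=0
 | eval MT = mvdedup(MT)
 | where mvcount(MT)>2

Only the first event should survive the where clause. Without mvdedup, the second event would also pass, because rex captures four matches even though only two different words are present.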

And here are the steps to build it...

| makeresults | eval Words="word1 word2 word3 word4" | makemv Words| mvexpand Words | table Words
| rename COMMENT as "above generates test data, you would use something like | inputlookup mywords | table Words "

| rename COMMENT as "mark Word records as detail so we can process them twice and then delete them later"
| eval rectype="detail" 

| rename COMMENT as "this creates a field search1 that looks like this ( \"word1\" OR \"word2\" OR \"word3\" OR \"word4\" ) "
| appendpipe 
    [ | where rectype=="detail" 
      | table Words | format "(" "" "" "" "OR" ")" 
      | rex mode=sed field=search "s/ Words=//g" 
      ] 
| eval rectype=coalesce(rectype,"search1")

| rename COMMENT as "this creates a field search2 that looks like this (?<MT>word1|word2|word3|word4) "
| appendpipe 
    [ | where rectype=="detail" 
      | table Words 
      | format "(?<MT>" "" "" "" "|" ")" 
      | rex mode=sed field=search "s/ Words=\"//g" 
      | rex mode=sed field=search "s/\"| //g" 
      ] 
| eval rectype=coalesce(rectype,"search2")


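| rename COMMENT as "this kills everything but search1 and search2 "
| eval rectype=coalesce(rectype,"other")
| where rectype!="detail" 
| eval {rectype} = search 
| fields - rectype search Words

| rename COMMENT as "search1 and search2 are on separate rows at this point, so merge them into one row for map"
| stats values(search1) as search1 values(search2) as search2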

| rename COMMENT as "and now we map your search"
| map search="your base search  $search1$ | rex field=yourfield  \\"$search2$\\" max_match=0 | eval MT = mvdedup(MT) | where mvcount(MT)>2" 
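
If the format and sed steps are hard to follow, the same two strings can probably be built on a single row with mvjoin instead. This is a different technique from the appendpipe approach above, offered only as a sketch; it assumes the words contain no quotes or regex special characters. With a real lookup you would start from something like | inputlookup mywords | stats values(Words) AS Words instead of the makeresults line.

 | makeresults | eval Words="word1 word2 word3 word4" | makemv Words
 | eval search1 = "(\"" . mvjoin(Words, "\" OR \"") . "\")"
 | eval search2 = "(?<MT>" . mvjoin(Words, "|") . ")"
 | table search1 search2

Because both fields end up on one row, map only has one row to run against.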

Other notes - map is extremely finicky, so I'd suggest you start by narrowing your base search to a couple of days and using only the first ten words, making sure that there are a couple of events in those days that should match.

Make sure to escape any quotes in your base search. I think they actually need to be double-escaped, \\" because the map search string seems to be parsed twice, once during parsing and once during execution. The bottom line, though, is to run a tiny test against a tiny time range and fiddle with the escaping until you get a valid result; then you can start adding things back until you have the search you want.

If it doesn't work right off, then remove | where mvcount(MT)>2 for the initial testing, so you can see if you got any intermediate results at all.
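
For that kind of testing, a sketch like the following may help: it swaps the where clause for a count you can look at. MT_count is just an illustrative field name, and yourfield and the word list are the same placeholders as above.

 your base search  ( "word1" OR "word2" OR "word3" OR "word4" )
 | rex field=yourfield  "(?<MT>word1|word2|word3|word4)" max_match=0
 | eval MT = mvdedup(MT)
 | eval MT_count = mvcount(MT)
 | table yourfield MT MT_count

An empty MT_count means rex found nothing in that event, which usually points at the field name or the regex rather than at the threshold.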

skoelpin
SplunkTrust

Providing sample data will help.

It will look something like this

... | stats count(Word) AS Word_Count | where Word_Count >2

somesoni2
Revered Legend

Can you provide a sample search, sample data, and the corresponding mock output?
