Search based on word match


I have a search built off of a lookup file that generates a list of words. I'm looking for assistance with a search that would give me results if the field I'm searching on matches 3 or more of those words.



I want to search a field in an index and return results only when it contains any 3 (or more) of the words in the word list.

So you can see what this is all about, here's the search we want to end up with:

 your base search  ( "word1" OR "word2" OR "word3" OR "word4" )
 | rex field=yourfield  "(?<MT>word1|word2|word3|word4)" max_match=0
 | eval MT = mvdedup(MT)
 | where mvcount(MT)>2    
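
To check the matching logic on its own before touching real data, here is a minimal sketch that fakes two events with makeresults. The field name yourfield and the helper field n are placeholders for illustration only. The first event contains three distinct words and should be kept; the second repeats a single word three times and should be dropped, which shows why the mvdedup step matters.

```spl
| makeresults count=2
| streamstats count as n
| eval yourfield=if(n==1, "word1 word3 word1 word4", "word2 word2 word2")
| rex field=yourfield "(?<MT>word1|word2|word3|word4)" max_match=0
| eval MT = mvdedup(MT)
| where mvcount(MT)>2
```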

And here are the steps to build it...

| makeresults | eval Words="word1 word2 word3 word4" | makemv Words| mvexpand Words | table Words
| rename COMMENT as "above generates test data, you would use something like | inputlookup mywords | table Words "

| rename COMMENT as "mark Word records as detail so we can process them twice and then delete them later"
| eval rectype="detail" 

| rename COMMENT as "this creates a field search1 that looks like this ( \"word1\" OR \"word2\" OR \"word3\" OR \"word4\" ) "
| appendpipe 
    [ | where rectype=="detail" 
      | table Words | format "(" "" "" "" "OR" ")" 
      | rex mode=sed field=search "s/ Words=//g" ]
| eval rectype=coalesce(rectype,"search1")

| rename COMMENT as "this creates a field search2 that looks like this (?<MT>word1|word2|word3|word4) "
| appendpipe 
    [ | where rectype=="detail" 
      | table Words 
      | format "(?<MT>" "" "" "" "|" ")" 
      | rex mode=sed field=search "s/ Words=\"//g" 
      | rex mode=sed field=search "s/\"| //g" ]
| eval rectype=coalesce(rectype,"search2")

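| rename COMMENT as "this kills everything but search1 and search2, then merges them into one row so that map runs only once "
| eval rectype=coalesce(rectype,"other")
| where rectype!="detail" 
| eval {rectype} = search 
| fields - rectype search Words
| stats values(search1) as search1 values(search2) as search2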

| rename COMMENT as "and now we map your search"
| map search="your base search  $search1$ | rex field=yourfield  \\"$search2$\\" max_match=0 | eval MT = mvdedup(MT) | where mvcount(MT)>2" 

Other notes: map is extremely finicky, so I'd suggest you start by narrowing your base search to a couple of days and using only the first ten words, making sure that a couple of events in those days should match.

Make sure to escape any quotes in your base search. I think they actually need to be double-escaped (\\") to work, because the map search string seems to be processed twice, once during parsing and once during execution. The bottom line, though, is to run a tiny test against a tiny time range and fiddle with the escaping until you get a valid result. Then you can start adding things back until you have the search you want.

If it doesn't work right off, then remove | where mvcount(MT)>2 for the initial testing, so you can see if you got any intermediate results at all.
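
As a sketch of that kind of intermediate check, the variant below replaces the final filter with a count and a table so you can see what each event matched. The field name match_count is made up for this example, and yourfield and the base search are the same placeholders used above.

```spl
your base search ( "word1" OR "word2" OR "word3" OR "word4" )
| rex field=yourfield "(?<MT>word1|word2|word3|word4)" max_match=0
| eval MT = mvdedup(MT)
| eval match_count = coalesce(mvcount(MT), 0)
| table _time yourfield MT match_count
```

Once the counts look right, put | where mvcount(MT)>2 back in place of the last two lines.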


Providing sample data will help.

It will look something like this:

... | stats count(Word) AS Word_Count | where Word_Count >2


Can you provide a sample search, sample data, and the corresponding mock output?

