Splunk Search

Another Noobie question: how to search for terms / keywords that are "close" to each other?

ks5752
Engager

Hi,

I've been searching around the forum and have been unable to find any guidance on this question. I figure that I am either approaching the task incorrectly, or I am not using the right search keywords.... anywho.... here goes with the question...

I'd like to be able to find records that have specific keywords that are within close proximity of each other. For instance with the phrase below...

WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY FOR ONE PEOPLE TO DISSOLVE

and

WHEN THE EVENTS ARE POSTED AND THE PEOPLE REACT, IT BECOMES A MESS

When matching on the keywords EVENTS and BECOMES, I would like to control which records are returned based on how close these two keywords are to each other...

PSEUDO CODE

FIND RECORDS WHERE "EVENTS" IS WITHIN 2 WORDS OF "BECOMES"

that way, the query would identify the first phrase and not the second phrase...

or

PSEUDO CODE

FIND RECORDS WHERE "EVENTS" IS GREATER THAN 3 WORDS APART FROM "BECOMES"

that variant would then identify the second phrase....

Any guidance would be appreciated.

Thanks.


martin_mueller
SplunkTrust

To quote xkcd... stand back, I know regular expressions!

... | regex _raw="EVENTS(\W+\w+){0,2}\W*BECOMES"

... | regex _raw="EVENTS(\W+\w+){3,}\W*BECOMES"

The first one allows for zero to two words between the two keywords, the second requires at least three.
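
As a quick sanity check outside Splunk, the two patterns can be tried against the sample phrases from the question. This is only a sketch using Python's re module rather than Splunk's regex command, on the assumption that these particular constructs behave the same way in both; the variable names are made up for illustration.

```python
import re

# The two patterns from the searches above, unchanged.
WITHIN_TWO = re.compile(r"EVENTS(\W+\w+){0,2}\W*BECOMES")
AT_LEAST_THREE = re.compile(r"EVENTS(\W+\w+){3,}\W*BECOMES")

# The two sample phrases from the question.
PHRASES = [
    "WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY FOR ONE PEOPLE TO DISSOLVE",
    "WHEN THE EVENTS ARE POSTED AND THE PEOPLE REACT, IT BECOMES A MESS",
]

for phrase in PHRASES:
    close = bool(WITHIN_TWO.search(phrase))
    far = bool(AT_LEAST_THREE.search(phrase))
    print(f"within 2 words: {close}, 3 or more words: {far}, phrase: {phrase[:30]}...")
```

The first phrase has one word (IT) between the keywords, so only the first pattern matches it; the second phrase has seven words between them, so only the second pattern matches.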

gkanapathy
Splunk Employee

Oh, it is quite important to add that the tokens "EVENTS" and "BECOMES" should be in the base search, or the search will be wastefully inefficient. So e.g.:

sourcetype=presidential_speeches EVENTS BECOMES | regex ...

otherwise you will retrieve events without the words you're interested in, and have to filter them with regex. By specifying the terms in the base search, you will avoid retrieving events/documents that don't contain the terms of interest at all. This is very similar really to how Splunk does phrase and key-value searching.
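
To illustrate why putting the terms in the base search helps, here is a toy analogy in Python. It is not how Splunk is implemented internally; the in-memory index, the third sample document, and the proximity_search helper are all hypothetical. The idea it sketches is simply: use a cheap term lookup to narrow the candidates, then run the regex only on what is left.

```python
import re
from collections import defaultdict

DOCS = [
    "WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY FOR ONE PEOPLE TO DISSOLVE",
    "WHEN THE EVENTS ARE POSTED AND THE PEOPLE REACT, IT BECOMES A MESS",
    "FOUR SCORE AND SEVEN YEARS AGO",  # made-up extra document without either keyword
]

# Toy inverted index: token -> set of ids of the documents containing it.
index = defaultdict(set)
for doc_id, text in enumerate(DOCS):
    for token in re.findall(r"\w+", text):
        index[token].add(doc_id)

def proximity_search(first, second, max_between=2):
    """Hypothetical helper: term lookup first, regex second."""
    # Step 1: a cheap set intersection keeps only documents holding both terms.
    candidates = sorted(index.get(first, set()) & index.get(second, set()))
    # Step 2: the comparatively expensive regex runs on those candidates only.
    pattern = re.compile(
        rf"{re.escape(first)}(\W+\w+){{0,{max_between}}}\W*{re.escape(second)}"
    )
    hits = [doc_id for doc_id in candidates if pattern.search(DOCS[doc_id])]
    return candidates, hits

candidates, hits = proximity_search("EVENTS", "BECOMES")
print("candidates after term lookup:", candidates)
print("hits after regex:", hits)
```

The third document never reaches the regex stage at all, which is the saving described above.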

gkanapathy
Splunk Employee

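The final \W* should probably be \W+ to avoid the last word being merged into BECOMES: because \W* can match nothing, a run like "ITBECOMES" would otherwise count as a match. Also, for text searches in general, it probably makes sense to specify (?i) so that the match is case-insensitive.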
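
A small sketch of the difference, again in Python's re module rather than in Splunk itself. It reads the suggested replacement for the final \W* as \W+ (one or more non-word characters), which is an assumption about the intent, and the sample strings are made up for illustration.

```python
import re

# Original pattern: the final \W* may match nothing at all.
LOOSE = re.compile(r"EVENTS(\W+\w+){0,2}\W*BECOMES")
# Variant with the two suggestions applied: a required separator and (?i).
STRICT = re.compile(r"(?i)EVENTS(\W+\w+){0,2}\W+BECOMES")

merged = "THE EVENTS ITBECOMES A MESS"        # BECOMES glued onto the previous word
lowercase = "in the course of human events it becomes necessary"

print(bool(LOOSE.search(merged)))      # True: "IT" is taken as a word, then BECOMES follows directly
print(bool(STRICT.search(merged)))     # False: a non-word character is required before BECOMES
print(bool(LOOSE.search(lowercase)))   # False: the pattern is case-sensitive
print(bool(STRICT.search(lowercase)))  # True: (?i) ignores case
```

If partial-word matches at the outer edges matter too (for example "NONEVENTS"), wrapping the keywords in \b word boundaries would be a further tightening, though that goes beyond what was suggested in this thread.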


yannK
Splunk Employee

brilliant.
