Splunk Search

Merge using a special logic

twjack
Explorer

I need to merge the entries of a multivalue field, as in the examples below, using custom logic. I have no idea how to do that and would appreciate some input.

One possible rule: merge an entry into the next one if the next multivalue entry is exactly one character longer.

Examples:

w
wh
wha
what
whats
whats a
whats ap
whats app

Should be: "whats app"

w
we
web
web.
web.de

Should be:
"web."
"web.de"

My current search from which the examples originate:

index=proxy url="*/search?q=*" url_domain="*.google.com"
| eval url = urldecode(url)
| rex field=url "q=(?P<search_terms>.*)"
| eval search_terms=replace(mvindex(split(search_terms,"&"),0),"\+"," ")
| eval search_engine="Google"
| stats earliest(_time) as earliest, latest(_time) as latest, values(search_terms) as search_terms by src search_engine
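
For reference, the merge rule described above can be sketched outside Splunk. This is a minimal Python sketch, not SPL, and the function name merge_keystroke_prefixes is made up for illustration. It assumes the values arrive in sorted order, as values() returns them, and that the next entry must also start with the current one. It further assumes spaces are ignored when comparing lengths: in the first example "whats" to "whats a" is two characters longer, yet it is still expected to merge.

```python
def merge_keystroke_prefixes(terms):
    """Drop each term that the next term extends by exactly one non-space character."""
    kept = []
    for i, term in enumerate(terms):
        # Assumption: spaces do not count as typed characters.
        current = term.replace(" ", "")
        following = terms[i + 1].replace(" ", "") if i + 1 < len(terms) else None
        extended_by_one = (
            following is not None
            and len(following) == len(current) + 1
            and following.startswith(current)
        )
        if not extended_by_one:
            kept.append(term)
    return kept


print(merge_keystroke_prefixes(
    ["w", "wh", "wha", "what", "whats", "whats a", "whats ap", "whats app"]
))  # ['whats app']
print(merge_keystroke_prefixes(["w", "we", "web", "web.", "web.de"]))
# ['web.', 'web.de']
```

With these assumptions both examples give the expected result: "web." survives because "web.de" is two characters longer.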
0 Karma

FrankVl
Ultra Champion

Your examples are nice and clean, but I cannot imagine your actual data will look like that and I don't really follow the logic you're trying to apply. What is it that you want to achieve with this in the end?

What happens when your set of search terms is this:

a
bb
ccc
dddd
whats
whats a
whats ap
whats app

Should those also all be merged into "whats app"?

And conceptually: what does the search term "we" have in common with the search term "web.de" that makes you want to merge them?
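
To make the first question concrete, here is a minimal Python sketch (not SPL) of a literal reading of the "length +1" rule, applied to the list above. The function name is hypothetical and the list order is assumed to be exactly as shown.

```python
def merge_by_length_only(terms):
    """Literal reading of the rule: drop a term if the next one is one character longer."""
    kept = []
    for i, term in enumerate(terms):
        is_last = i + 1 == len(terms)
        # Only lengths are compared; the content of the terms is ignored.
        if is_last or len(terms[i + 1]) != len(term) + 1:
            kept.append(term)
    return kept


terms = ["a", "bb", "ccc", "dddd", "whats", "whats a", "whats ap", "whats app"]
print(merge_by_length_only(terms))  # ['whats', 'whats app']
```

Under that reading the unrelated terms a, bb, ccc and dddd disappear as well, because each is followed by a term that happens to be one character longer.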

0 Karma

renjith_nair
Legend

Try this and let me know if it works. Sample output including the other fields would help to fine-tune it.

 index=proxy url="*/search?q=*" url_domain="*.google.com"
 | eval url = urldecode(url)
 | rex field=url "q=(?P<search_terms>.*)"
 | eval search_terms=replace(mvindex(split(search_terms,"&"),0),"\+"," ")
 | eval search_engine="Google"
 | stats earliest(_time) as earliest, latest(_time) as latest, values(search_terms) as search_terms by src search_engine
 | eval total=mvcount(search_terms)
 | mvexpand search_terms
 | eval length=len(search_terms)
 | delta length as difference
 | fillnull value=0
 | streamstats count by src,search_engine reset_on_change=true
 | where difference > 1 OR count==total
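
A rough way to check what the final where filter keeps is to model it outside Splunk. The Python sketch below is only an approximation for a single src/search_engine group. It assumes delta yields the current length minus the previous row's length and that fillnull turns the empty first delta into 0. The function name is made up for illustration.

```python
def delta_filter(terms):
    """Single-group model of: delta length | where difference > 1 OR count == total."""
    total = len(terms)
    kept = []
    previous_length = None
    for count, term in enumerate(terms, start=1):
        length = len(term)
        # Assumption: the first row has no delta, and fillnull sets it to 0.
        difference = 0 if previous_length is None else length - previous_length
        if difference > 1 or count == total:
            kept.append(term)
        previous_length = length
    return kept


print(delta_filter(["w", "we", "web", "web.", "web.de"]))
# ['web.de']
print(delta_filter(
    ["w", "wh", "wha", "what", "whats", "whats a", "whats ap", "whats app"]
))  # ['whats a', 'whats app']
```

Under those assumptions the filter keeps the row after a length jump rather than the row before it, so the output may differ from the results asked for in the question.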
---
What goes around comes around. If it helps, hit it with Karma 🙂
0 Karma