Splunk Search

What is the best way to merge a set of values that refer to the same thing in a field?

Stevelim
Communicator

For example, in a field "customer", I have the following events and values:
Event 1: abc
Event 2: abc pte ltd

I want to merge their values to say "abc". Is it possible to do it programmatically instead of with a manual replace command for every occurrence?

1 Solution

woodcock
Esteemed Legend

Based on your clarification, if we assume the first word is key, you can do it like this:

... | rex field=customer "(?<CustomerAsFirstWord>[\S]+)" | ...
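To try the first-word approach without touching real data, here is a minimal sketch that fabricates the three sample values from this thread with makeresults. The field name n is just a helper made up for this example, and I anchored the regex with ^ so it always takes the leading word:

```spl
| makeresults count=3
| streamstats count AS n
| eval customer=case(n=1, "Abc", n=2, "Abc Pte Ltd", n=3, "Abc Technologies")
| rex field=customer "^(?<CustomerAsFirstWord>\S+)"
| stats count BY CustomerAsFirstWord
```

All three sample values should collapse into one row. Keep in mind that this rule also merges unrelated customers that happen to share a first word, which is why it is only a starting point.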

But surely this is not good enough, so the next thing you can do is create a lookup file like this:

rawCustomer,normalizedCustomer
Abc,Abc
Abc Pte Ltd,Abc
Abc Technologies,Abc
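If the list of customers is long, you do not have to type the file by hand. One possible way to seed it, sketched here under the assumption that the file is called mylookup.csv and that the first-word rule is an acceptable first guess, is to let a search write the draft and then correct the wrong rows manually:

```spl
... | stats count BY customer
| rename customer AS rawCustomer
| rex field=rawCustomer "^(?<normalizedCustomer>\S+)"
| table rawCustomer normalizedCustomer
| outputlookup mylookup.csv
```

By default outputlookup overwrites an existing file of the same name, so run this once to create the draft rather than on a schedule, or your manual corrections will be lost.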

Then you do this:

... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer AS customer
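A variant you may find safer while the lookup file is still incomplete: write the lookup result to its own field and fall back to the original value when there is no matching row. This is only a sketch; it assumes mylookup is a lookup definition that points at the CSV file above:

```spl
... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer
| eval customer=coalesce(normalizedCustomer, customer)
| stats count BY customer
```

Customers that are not yet in the file then still show up under their raw name, which makes it easy to spot what is missing. Also note that, as far as I know, CSV lookups match case-sensitively by default, so "abc" will not match a row keyed "Abc" unless you change that in the lookup definition.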

Stevelim
Communicator

Thank you so much! I didn't know Splunk was able to generate an output and append another value!


woodcock
Esteemed Legend

It would really help if you clarified your question with full details, including exactly what is in which fields. I am assuming that customer is a multivalued field and that you would like to see how many events go with each customer; you can do that like this:

... | stats count BY customer

Stevelim
Communicator

Hi there, here's the additional information:

Example Data:
Event 1: customer=Abc
Event 2: customer=Abc Pte Ltd
Event 3: customer=Abc Technologies

I would like to normalize all of them to Abc, as they are actually the same entity. Different departments keyed in the same entity under different names.

If I do a stats count by customer, it will probably treat the 3 events as 3 different entities. My current solution is to replace "Abc Pte Ltd" with "Abc" in customer. I'm just wondering whether there is a more general solution that does this automatically, so that I don't have to crawl through the entire list via, say, a stats values(customer) command and add the replace commands one by one.

Hope this clears up my situation.
