Splunk Search

What is the best way to merge a set of values that refer to the same thing in a field?

Stevelim
Communicator

For example, in the field "customer" I have the following events and values:
Event 1: abc
Event 2: abc pte ltd

I want to merge their values into "abc". Is it possible to do this programmatically instead of writing a manual replace command for every occurrence?


woodcock
Esteemed Legend

Based on your clarification, if we assume the first word is the key, you can do it like this:

... | rex field=customer "(?<CustomerAsFirstWord>[\S]+)" | ...
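
To try this without touching real data, here is a minimal sketch that fabricates the three sample values from this thread with makeresults and then counts by the extracted first word. The makeresults/streamstats/case scaffolding and the helper field n are only there for illustration and are not part of the original answer:

```spl
| makeresults count=3
| streamstats count AS n
| eval customer=case(n==1, "Abc", n==2, "Abc Pte Ltd", n==3, "Abc Technologies")
| rex field=customer "(?<CustomerAsFirstWord>[\S]+)"
| stats count BY CustomerAsFirstWord
```

All three sample values should collapse into a single Abc row, because each of them starts with the same word.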

But this is probably not good enough (for example, two different customers can share the same first word), so the next thing you can do is create a lookup file like this:

rawCustomer,normalizedCustomer
Abc,Abc
Abc Pte Ltd,Abc
Abc Technologies,Abc

Then you do this:

... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer AS customer
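
A small variation, sketched here on the assumption that some customer values may not have a row in the lookup yet: write the lookup result to its own field and fall back to the raw value with coalesce, so it is explicit what happens to names that are missing from the file. The lookup name mylookup and its column names are taken from the example above:

```spl
... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer
| eval customer=coalesce(normalizedCustomer, customer)
| stats count BY customer
```

Lookup matching is case-sensitive by default, so "abc" and "Abc" need separate rows unless the lookup definition sets case_sensitive_match to false.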


Stevelim
Communicator

Thank you so much! I didn't know Splunk could generate an output and write it over another value!


woodcock
Esteemed Legend

It would really help if you clarified your question with full details, including exactly what is in which fields. I am assuming that the field customer is a multivalued field and that you would like to see how many events go with each customer; you can do that like this:

... | stats count BY customer

Stevelim
Communicator

Hi there, here's the additional information:

Example Data:
Event 1: customer=Abc
Event 2: customer=Abc Pte Ltd
Event 3: customer=Abc Technologies

I would like to normalize all of them to Abc, as they are actually the same entity. Different departments keyed in the same entity under different names.

If I do a stats count by customer, it will treat the 3 events as 3 different entities. My current solution is to replace "Abc Pte Ltd" with "Abc" in customer. I'm just wondering whether there is a more general solution that does this automatically, so that I don't have to crawl through the entire list (via, say, a stats values(customer) command) and slowly add replace commands one by one.

Hope this clears up my situation.
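
One possible way to avoid hand-writing every replace is to strip common company suffixes with a regular expression and use the result to seed the lookup file. This is only a rough sketch under stated assumptions: the suffix list (pte ltd, technologies) is hypothetical and would need extending for real data, and mylookup.csv is an assumed file name for the lookup discussed in this thread.

```spl
... | stats count BY customer
| rename customer AS rawCustomer
| eval normalizedCustomer=replace(rawCustomer, "(?i)\s+(pte\.?\s+ltd\.?|technologies)\s*$", "")
| table rawCustomer normalizedCustomer
| outputlookup mylookup.csv
```

By default outputlookup overwrites the target file, so it is worth running the search without the last line first and reviewing the generated pairs by hand before saving them.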
