Solved: What is the best way to merge a set of values that refer to the same thing in a field?

Stevelim · ‎07-17-2015

For example in a field "customer", I have the following events and values:
Event 1: abc
Event 2: abc pte ltd

I want to merge their values into a single value, say "abc". Is it possible to do this programmatically instead of writing a manual replace command for every occurrence?

woodcock · ‎07-17-2015

Based on your clarification, if we assume the first word is the key, you can do it like this:

... | rex field=customer "(?<CustomerAsFirstWord>[\S]+)" | ...
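To see what the first-word rule does, here is a minimal sketch you can paste into a search bar. It fabricates the three sample customer values from this thread with makeresults (the field n is only a helper for building the test rows) and then counts by the extracted first word:

```spl
| makeresults count=3
| streamstats count AS n
| eval customer=case(n==1, "Abc", n==2, "Abc Pte Ltd", n==3, "Abc Technologies")
| rex field=customer "^(?<CustomerAsFirstWord>\S+)"
| stats count BY CustomerAsFirstWord
```

All three rows should land under the same first word. Keep in mind that the match is case sensitive and that any two unrelated customers sharing a first word would be merged as well.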

But this is surely not good enough on its own, so the next thing you can do is create a lookup file like this:

rawCustomer,normalizedCustomer
Abc,Abc
Abc Pte Ltd,Abc
Abc Technologies,Abc

Then you do this:

... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer AS customer
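If some customer values are not in the lookup yet, a slightly more defensive variant is sketched below. It assumes the same lookup definition (mylookup) and column names as above, writes the lookup result to its own field, and falls back to the raw value when there is no match, so unlisted customers still appear in the results:

```spl
... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer
    | eval customer=coalesce(normalizedCustomer, customer)
    | fields - normalizedCustomer
    | stats count BY customer
```

Any customer that still shows up under an unnormalized name in the stats output is a candidate for a new row in the lookup file. Lookup matching is case sensitive by default; if your data mixes "abc" and "Abc", consider setting case_sensitive_match to false in the lookup definition.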

Stevelim · ‎07-17-2015

Thank you so much! I didn't know Splunk was able to generate an output and append another value!

woodcock · ‎07-17-2015

It would really help if you clarified your question with full details, including exactly what is in which fields. I am assuming that the customer field is multivalued and that you would like to see how many events go with each customer; you can do that like this:

... | stats count BY customer

Stevelim · ‎07-17-2015

Hi there, here's the additional information:

Example Data:
Event 1: customer=Abc
Event 2: customer=Abc Pte Ltd
Event 3: customer=Abc Technologies

I would like to normalize all of them to Abc, as they are actually the same entity. Different departments keyed in the same entity under different names.

If I do a stats count by customer, it will treat the 3 events as 3 different entities. My current solution is to replace "Abc Pte Ltd" with "Abc" in customer. I'm just wondering whether there is a more general solution that does this automatically, so that I don't have to crawl through the entire list (via, say, a stats values(customer) command) and slowly add replace commands one by one.

Hope this clears up my situation.
