Splunk Search

What is the best way to merge a set of values that refer to the same thing in a field?

Communicator

For example, in the field "customer", I have the following events and values:
Event 1: abc
Event 2: abc pte ltd

I want to merge their values into "abc". Is it possible to do this programmatically instead of writing a manual replace command for every occurrence?


Re: What is the best way to merge a set of values that refer to the same thing in a field?

Esteemed Legend

It would really help if you clarified your question with full details, including exactly what is in which fields. I am assuming that the field customer is a multivalued field and that you would like to see how many events go with each customer; you can do that like this:

... | stats count BY customer

Re: What is the best way to merge a set of values that refer to the same thing in a field?

Communicator

Hi there, here's the additional information:

Example Data:
Event 1: customer=Abc
Event 2: customer=Abc Pte Ltd
Event 3: customer=Abc Technologies

I would like to normalize all of them to Abc, as they are actually the same entity. Different departments keyed in the same entity under different names.

If I do a stats count by customer, it will treat the 3 events as 3 different entities. My current solution is to replace "Abc Pte Ltd" with "Abc" in customer. I'm just wondering whether there is a more general solution that does this automatically, so that I don't have to crawl through the entire list (via, say, a stats values(customer) command) and slowly add replace commands one by one.
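For reference, the manual approach I am using looks roughly like the sketch below. The trailing stats is only there to show the effect, and every new spelling needs another WITH pair added by hand:

```spl
... | replace "Abc Pte Ltd" WITH "Abc", "Abc Technologies" WITH "Abc" IN customer
    | stats count BY customer
```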

Hope this clears up my situation.


Re: What is the best way to merge a set of values that refer to the same thing in a field?

Esteemed Legend

Based on your clarification, if we assume the first word is the key, you can do it like this:

... | rex field=customer "(?<CustomerAsFirstWord>[\S]+)" | ...
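As a minimal sketch of how the extracted field might then be used, assuming the goal is still a count per customer: the lower() call is my own addition, there because your first example had "abc" in lowercase and the match would otherwise be case-sensitive.

```spl
... | rex field=customer "(?<CustomerAsFirstWord>[\S]+)"
    | eval CustomerAsFirstWord=lower(CustomerAsFirstWord)
    | stats count BY CustomerAsFirstWord
```

Note that any two different customers sharing a first word would be merged as well.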

But surely this is not good enough, so the next thing you can do is create a lookup file like this:

rawCustomer,normalizedCustomer
Abc,Abc
Abc Pte Ltd,Abc
Abc Technologies,Abc

Then you do this:

... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer AS customer
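One thing to watch, as far as I know: because OUTPUT overwrites the existing field, customers that are not yet in the lookup may be left without a customer value. A hedged variant that falls back to the raw value, assuming a lookup definition named mylookup points at the CSV above, could look like this:

```spl
... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer
    | eval customer=coalesce(normalizedCustomer, customer)
    | stats count BY customer
```

With this form, unmatched names still show up in the stats output under their original spelling, which makes it easy to spot the rows that need adding to the lookup file.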


Re: What is the best way to merge a set of values that refer to the same thing in a field?

Communicator

Thank you so much! I didn't know Splunk was able to take the OUTPUT of a lookup and write it to another field!
