How to dedup multivalued fields?

UMDTERPS
Communicator

Some of the data coming in from one of our indexes looks like the following (the data appears to be repeated within each field):


ip                         User       System
192.168.1.1 192.168.1.1    BOB BOB    ABC ABC

How can I get the data to show only one value per field (that is, stop the same value from repeating within each field)?

ip             User    System
192.168.1.1    BOB     ABC

dedup obviously won't work in this instance, since it removes duplicate events rather than repeated values within a field.


scelikok
SplunkTrust

Hi @UMDTERPS,

If the field values are multivalue, you can use the workaround below for a few fields.

| eval ip=mvindex(split(ip," "),0)
| eval User=mvindex(split(User," "),0)
| eval System=mvindex(split(System," "),0)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

UMDTERPS
Communicator

I'm still getting the same IP address repeated in the field when using:

| eval ip=mvindex(split(ip," "),0)
| eval User=mvindex(split(User," "),0)
| eval System=mvindex(split(System," "),0)


ip
198.168.1.1
198.168.1.1


Weird. I wonder if something is off with the data?


scelikok
SplunkTrust

What if we do not split?

| eval ip=mvindex(ip,0)
| eval User=mvindex(User,0)
| eval System=mvindex(System,0)
If this reply helps you an upvote and "Accept as Solution" is appreciated.

UMDTERPS
Communicator

So, we believe the data coming in from the indexer has some sort of line break, which is why splitting the fields on a space won't work. I talked to another engineer at work, and he said it may require a regex. I'll keep this thread updated.


bowesmana
SplunkTrust

You should be able to use replace+regex to change that line break to a space and then split/dedup on that, e.g.

| eval ip=mvdedup(split(replace(ip, "\n", " "), " "))
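
A minimal, self-contained way to try this without touching the real index, assuming the field holds a single string with an embedded line feed. The sample value is made up with makeresults, and urldecode("%0A") is used here only to produce the line feed:

```spl
| makeresults
| eval ip="192.168.1.1" . urldecode("%0A") . "192.168.1.1"
| eval ip=mvdedup(split(replace(ip, "\n", " "), " "))
| table ip
```

Here replace turns the line feed into a space, split breaks the string into a multivalue field on that space, and mvdedup drops the repeated value, so ip should end up with a single value.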

UMDTERPS
Communicator

This worked!

| eval ip=mvdedup(split(replace(ip, "\n", " "), " "))

An engineer at work gave me this (yours is better):

|rex mode=sed "s/([0-9\.]+)\n.*/\1/g" field=ip

However, it only works for the ip field, and you would have to create a custom regex for each field. I will have to get with the admin to fix the data coming in.

Also, we had an issue with the formatting of the data in each field, which made the data look like a giant column. This was the fix:

|eval ip = replace(ip, "\n", " ")

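To avoid writing a separate expression for each field, the same cleanup could probably be applied to all three fields at once with foreach. This is only a sketch: it assumes the field names ip, User, and System from the original question, builds made-up sample values with makeresults, and uses a helper field nl that exists only for this example:

```spl
| makeresults
| eval nl=urldecode("%0A")
| eval ip="192.168.1.1" . nl . "192.168.1.1", User="BOB" . nl . "BOB", System="ABC" . nl . "ABC"
| foreach ip User System
    [ eval <<FIELD>>=mvdedup(split(replace('<<FIELD>>', "\n", " "), " ")) ]
| table ip User System
```

Against real data, only the foreach line and its bracketed eval would be appended to the existing search.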

scelikok
SplunkTrust

If you can provide a few sample events, we can help better.

If this reply helps you an upvote and "Accept as Solution" is appreciated.

FelixLeh
Contributor

Hey, 
I'm relatively new to Splunk, so I don't know if there is a more elegant way to do this, but the following should work just fine:

| makemv ip
| makemv User
| makemv System
| mvexpand ip
| mvexpand User
| mvexpand System
| dedup User ip System

This should output a row for every combination in your source, excluding the duplicates.
If the fields are already multivalue, you can skip all the makemv lines!


UMDTERPS
Communicator

Unfortunately that does not work. 🙁


ITWhisperer
SplunkTrust

Are you saying that the indexer has created a multivalue field with duplicate values for some (or all?) of your events, or are these multivalue fields the result of a search query?
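
One way to tell the cases apart might be to count the values in the field with mvcount. A sketch to append to the existing search, assuming the field is named ip as in the question; ip_count is a name made up for this example:

```spl
| eval ip_count=mvcount(ip)
| table ip ip_count
```

If ip_count is greater than 1, the field is a true multivalue field and mvdedup(ip) on its own should be enough. If it is 1 even though two values are displayed, the field is most likely a single string with an embedded line break or space.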


UMDTERPS
Communicator

I'm not sure about that. There could be an issue with how the data is getting into or out of the indexer. I don't have admin rights (I'm not the admin), but this issue is preventing me from doing lookups and/or joins on the data with CSVs.
