Splunk Search

How to dedup multivalued fields?

UMDTERPS
Communicator

Some of the data coming in from one of our indexes looks like the following (the data appears to repeat within each field):


ip                        User      System
192.168.1.1 192.168.1.1   BOB BOB   ABC ABC

How can I get each field to show only one value (that is, stop the same data from repeating in each field)? The desired output:

ip            User   System
192.168.1.1   BOB    ABC

Dedup obviously won't work in this instance. 
 

scelikok
SplunkTrust

Hi @UMDTERPS,

If the field values are multivalue, you can use the workaround below for a few fields.

| eval ip=mvindex(split(ip," "),0)
| eval User=mvindex(split(User," "),0)
| eval System=mvindex(split(System," "),0)

 

If this reply helps you, an upvote and "Accept as Solution" is appreciated.

UMDTERPS
Communicator

I'm still getting the same IP address repeated when running:

| eval ip=mvindex(split(ip," "),0)
| eval User=mvindex(split(User," "),0)
| eval System=mvindex(split(System," "),0)


ip
192.168.1.1
192.168.1.1


Weird. I wonder if something is off with the data.


scelikok
SplunkTrust

What if we do not split?

| eval ip=mvindex(ip,0)
| eval User=mvindex(User,0)
| eval System=mvindex(System,0)

UMDTERPS
Communicator

We believe the data coming in from the indexer contains some sort of line break, so splitting the fields on a space won't work. I talked to another engineer at work, and he said it may require a regex. I'll keep this thread updated.


bowesmana
SplunkTrust

You should be able to use replace+regex to change that line break to a space and then split/dedup on that, e.g.

| eval ip=mvdedup(split(replace(ip, "\n", " "), " "))
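
To try this without touching real data, here is a minimal sketch that fakes a single event with makeresults. It assumes the repeated values sit in one string separated by a newline, as described earlier in the thread; urldecode("%0A") is used here only to put a newline into the test value.

```spl
| makeresults
| eval ip=urldecode("192.168.1.1%0A192.168.1.1")
| eval ip=mvdedup(split(replace(ip, "\n", " "), " "))
| table ip
```

If the real data behaves like this sample, ip should come back as the single value 192.168.1.1. Note that splitting on a space would also break apart any value that legitimately contains spaces.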

UMDTERPS
Communicator

This worked!

| eval ip=mvdedup(split(replace(ip, "\n", " "), " "))

An engineer at work gave me this (yours is better):

| rex mode=sed "s/([0-9\.]+)\n.*/\1/g" field=ip

However, it only works for the ip field, and you would have to write a custom regex for each field. I will have to work with the admin to fix the data coming in. We also had an issue with how the data was displayed in each field: it looked like one giant column. This was the fix:

|eval ip = replace(ip, "\n", " ")
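
Unlike the sed expression, the eval approach does not depend on what the values look like, so the same expression can be applied to several fields. A sketch using foreach, assuming the three field names from the original question (ip, User, System) and that each one holds newline-separated repeats:

```spl
| foreach ip User System
    [ eval <<FIELD>>=mvdedup(split(replace('<<FIELD>>', "\n", " "), " ")) ]
```

This is only a convenience over writing three separate eval lines. Values that contain spaces would be split into separate pieces, so it fits data shaped like the sample and may need adjusting otherwise.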

 


scelikok
SplunkTrust

If you can provide a few sample events, we can help better.


FelixLeh
Contributor

Hey,
I'm relatively new to Splunk, so I don't know if there is a more elegant way to do this, but the following should work:

| makemv ip
| makemv User
| makemv System
| mvexpand ip
| mvexpand User
| mvexpand System
| dedup User ip System

This should output a row for every combination in your source, excluding the duplicates.
If the fields are already multivalue, you can skip all the makemv lines.


UMDTERPS
Communicator

Unfortunately that does not work. 🙁


ITWhisperer
SplunkTrust

Are you saying that the indexer has created a multivalue field with duplicate values for some (or all?) of your events, or are these multivalue fields the result of a search query?
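
One way to tell the two cases apart is to check how many values Splunk sees in the field. A rough sketch, assuming the field is named ip as in the question; ip_values and ip_has_newline are just illustrative names:

```spl
| eval ip_values=mvcount(ip)
| eval ip_has_newline=if(match(ip, "\n"), "yes", "no")
| table ip ip_values ip_has_newline
```

If ip_values is 1 and ip_has_newline is yes, the field is most likely a single string with an embedded line break rather than a true multivalue field. If ip_values is greater than 1, the field is already multivalue, and mvdedup(ip) on its own may be enough.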


UMDTERPS
Communicator

I'm not sure. There could be an issue with how the data is getting into or out of the indexer. I don't have admin rights (I'm not the admin), but this issue is preventing me from doing lookups and/or joins on the data with CSVs.
