Dedup within a MV field

Explorer

I need the ability to dedup a multi-value field on a per event basis. Something like values() but limited to one event at a time. The ordering within the mv doesn't matter to me, just that there aren't duplicates. Any help is greatly appreciated.

My search:

host=test* | transaction Customer maxspan=3m | eval logSplit = split(_raw,",") | eval eventSplit = mvfilter(match(logSplit, "^[Ee]vent-")) | table eventSplit

Normal output:

event-001 = date:02/14/2013 12:48:09 -0500|result:available_retrieve_success
event-002 = date:02/14/2013 12:48:10 -0500|result:scan_success|token:uf
event-003 = date:02/14/2013 12:48:11 -0500|result:retrieve_success|txType:P|txRefId:c0544ec1-bce5-4c4e-bc9d-f6e9072131ad
event-001 = date:02/14/2013 12:48:09 -0500|result:available_retrieve_success
event-002 = date:02/14/2013 12:48:10 -0500|result:scan_success|token:uf
event-001 = date:02/13/2013 12:49:20 -0500|result:log_success
event-003 = date:02/14/2013 12:48:11 -0500|result:retrieve_success|txType:P|txRefId:c0544ec1-bce5-4c4e-bc9d-f6e9072131ad
event-001 = date:02/14/2013 12:48:16 -0500|result:p_success|txRefId:c0544ec1-bce5-4c4e-bc9d-f6e9072131ad|total:6.1
event-001 = date:02/14/2013 12:48:16 -0500|result:p_success|txRefId:c0544ec1-bce5-4c4e-bc9d-f6e9072131ad|total:6.1

Preferred output:

event-001 = date:02/14/2013 12:48:09 -0500|result:available_retrieve_success
event-002 = date:02/14/2013 12:48:10 -0500|result:scan_success|token:uf
event-001 = date:02/13/2013 12:49:20 -0500|result:log_success
event-003 = date:02/14/2013 12:48:11 -0500|result:retrieve_success|txType:P|txRefId:c0544ec1-bce5-4c4e-bc9d-f6e9072131ad
event-001 = date:02/14/2013 12:48:16 -0500|result:p_success|txRefId:c0544ec1-bce5-4c4e-bc9d-f6e9072131ad|total:6.1

Re: Dedup within a MV field

SplunkTrust

You could make use of the regular dedup like this:

...  | streamstats count | mvexpand eventSplit | dedup count eventSplit | mvcombine eventSplit | fields - count
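
To see how the pieces fit together, here is a self-contained sketch on made-up data. The makeresults row and the sample values are assumptions for illustration only, not part of the original search. streamstats count tags each event with a unique number, mvexpand splits the multi-value field into one row per value, dedup drops repeated (count, eventSplit) pairs, and mvcombine folds the remaining rows back into one multi-value field per event:

```
| makeresults
| eval eventSplit = split("a,b,a,c,b", ",")
| streamstats count
| mvexpand eventSplit
| dedup count eventSplit
| mvcombine eventSplit
| fields - count
```

Because count is part of the dedup key, identical values in different events should be left alone; only repeats within the same event are removed.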

Re: Dedup within a MV field

Explorer

Thanks to both of you, as both suggestions worked to a certain degree. The stats trick did some strange things to the output, so I ended up using the mvexpand/mvcombine approach along with eventstats.

Much appreciated!

Re: Dedup within a MV field

SplunkTrust

Another idea is to use stats values(), but do a weird trick to make it calculate unique values only within each row.

| streamstats count as row_number | stats values(mvField) as mvField by row_number | fields - row_number
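
Two things to keep in mind with this approach: stats keeps only the fields named in the command, so every other field on the event is dropped, and values() returns the unique values in lexicographical order rather than in their original order. A minimal sketch on made-up data (the makeresults row and the sample values are assumptions for illustration):

```
| makeresults
| eval mvField = split("b,a,b,c,a", ",")
| streamstats count as row_number
| stats values(mvField) as mvField by row_number
| fields - row_number
```

If other fields need to survive, they would have to be carried through the stats call explicitly, for example with additional aggregations such as first(fieldname).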

Re: Dedup within a MV field

Motivator

I know this is an old question, but I stumbled upon this while trying to do the same thing, and there is now a much cleaner solution:

eval mvfield=mvdedup(mvfield)
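
Applied to the search from the original question, that would presumably look something like this. It is a sketch and has not been run against the original data; the character class is written as [Ee] here, since [E|e] would also match a literal pipe:

```
host=test* | transaction Customer maxspan=3m
| eval logSplit = split(_raw, ",")
| eval eventSplit = mvfilter(match(logSplit, "^[Ee]vent-"))
| eval eventSplit = mvdedup(eventSplit)
| table eventSplit
```

Since mvdedup works on one event's field value at a time, no streamstats counter or mvexpand/mvcombine round trip is needed.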

Re: Dedup within a MV field

Builder

I ran into this need today and stumbled across this post...

For anyone else who finds this post while trying to do the same thing: note that mvdedup was only introduced in Splunk 6.2.0, so it is not available on older versions.

Re: Dedup within a MV field

Explorer

Exactly what I was looking for.

Love this community.