## Dedup with multiple criteria

Motivator

Hi,

I want to use the dedup command with more than one criterion.

First I used `| dedup A` and had 100 events afterwards.
Then I used `| dedup A, B` and had 70 events afterwards. In my understanding the number of events should increase, because I've made the dedup criteria more specific, so fewer duplicates should be identified. Am I completely wrong?

Best

Heinz

Motivator

dedup keepempty=t A B
http://docs.splunk.com/Documentation/Splunk/6.2.2/SearchReference/Dedup

My understanding is that dedup on 3 fields finds all matches on any two of them as duplicates. I will cite my source for that in a moment or just provide the results of a test case in support of that assertion, but I remember learning it in a Splunk course and testing it myself for validation.

Motivator

A further question regarding the dedup command:

Let's say the fields A & B can appear multiple times in an event.
For example:

Event 1:
A=1
A=2
B=3
B=4
timestamp=X

Event 2:
A=1
A=2
B=3
B=4
timestamp=X

Event 3:
A=1
A=2
B=3
B=4
timestamp=Y

```
| dedup A,B,timestamp
```

Does this take all field values of A and B into account, so that two events remain (event 1 and event 3)?

Heinz

Motivator

Thanks for confirming!

Champion

Yes, it keeps one event for each distinct combination of the fields above.

Motivator

Ah, now the numbers are changing in the correct direction :)

And when I want to ignore events where the dedup criteria don't exist, I can just use

sourcetype=* AND
A=* AND
B=*

| dedup A,B

Thanks a lot!

Legend

Then that's your problem there. You can do `... | fillnull B | ...` if you want B with an empty value in events that don't have it. That will make dedup work.
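To illustrate the suggestion above, here is a rough Python model (not SPL) of filling in a missing B before deduplicating. The helper names `fillnull` and `dedup` and the sample events are made up for illustration. The sketch assumes that dedup drops events missing any listed field by default, and that fillnull substitutes `0` when no value is given, as I understand the documentation.

```python
def fillnull(events, field, value="0"):
    """Return copies of the events with `field` set to `value` where it is missing."""
    return [{**event, field: event.get(field, value)} for event in events]


def dedup(events, fields):
    """Keep the first event per distinct combination of `fields`.

    Assumption: events missing any of the fields are dropped,
    mirroring dedup's default keepempty=false behaviour.
    """
    seen = set()
    kept = []
    for event in events:
        if any(f not in event for f in fields):
            continue  # dropped: a dedup field is missing
        key = tuple(event[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


# Hypothetical sample: only the first event carries B.
events = [{"A": 1, "B": 3}, {"A": 2}, {"A": 2}, {"A": 3}]

without_fill = dedup(events, ["A", "B"])
with_fill = dedup(fillnull(events, "B"), ["A", "B"])

print(len(without_fill), len(with_fill))
```

In this model the events without B only survive the dedup once B has been filled in; the two `A=2` events then collapse into one.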

Motivator

Hey Ayn,

Yes, normally it should exist in all events. Is there a command to find out whether there are events without the field B, and to filter them out?

Edit:

Just tried it out with `sourcetype=* AND NOT B=*`.
This returns a few events.

Legend

Does B exist in all your events? IIRC dedup will fail otherwise.
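For anyone wondering why adding a field can lower the count, here is a small Python model (not SPL) of how dedup is documented to behave. The function name, the `keepempty` parameter and the sample events are illustrative assumptions, not Splunk internals. The point it shows: with the default `keepempty=false`, events that lack any of the listed fields are dropped entirely.

```python
def dedup(events, fields, keepempty=False):
    """Keep the first event for each distinct combination of `fields`.

    Events missing any of the fields are dropped unless keepempty is True,
    in which case they are all kept (assumed to mirror dedup keepempty=t).
    """
    seen = set()
    kept = []
    for event in events:
        if any(f not in event for f in fields):
            if keepempty:
                kept.append(event)
            continue
        key = tuple(event[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


# Hypothetical sample: two events have no field B.
events = [
    {"A": 1, "B": 3},
    {"A": 1, "B": 4},
    {"A": 2},
    {"A": 3},
]

by_a = dedup(events, ["A"])
by_ab = dedup(events, ["A", "B"])
by_ab_keepempty = dedup(events, ["A", "B"], keepempty=True)

print(len(by_a), len(by_ab), len(by_ab_keepempty))
```

Here deduplicating on A alone keeps three events, while A and B together keep only two, because the events without B vanish. With `keepempty=True` all four remain.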