Re: Dedup with multiple criteria

HeinzWaescher · ‎01-15-2014

Hi,

I want to use the dedup command with more than one criterion.

First I used | dedup A and had 100 events afterwards.
Then I used | dedup A, B and had 70 events afterwards. As I understand it, the number of events should increase, because I've narrowed the dedup criteria and fewer duplicates should be identified. Am I completely wrong?

Best

Heinz

landen99 · ‎04-28-2015

dedup keepempty=t A B
http://docs.splunk.com/Documentation/Splunk/6.2.2/SearchReference/Dedup

My understanding, based on the documentation linked above, is that dedup on several fields treats two events as duplicates only when they share the same combination of values for all of the listed fields. By default, events in which any of those fields is missing are dropped, which is why keepempty=t helps here.
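A test case along these lines can be run directly. This is only a sketch: it assumes a Splunk version that provides makeresults, and the values x, y, b and the helper field n are made up for illustration. It builds three events, the third of which has no B.

```spl
| makeresults count=3
| streamstats count AS n
| eval A=if(n<=2, "x", "y"), B=if(n<=2, "b", null())
| dedup A B
| table n A B
```

With plain dedup A B I would expect only n=1 to remain: n=2 repeats the A/B combination and n=3 is dropped because B is missing. With dedup A alone, n=1 and n=3 should remain, which matches the pattern in the question (fewer events after adding B). Changing the dedup line to dedup keepempty=t A B should bring n=3 back.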

HeinzWaescher · ‎01-17-2014

A further question regarding the dedup command:

Let's say the fields A & B can appear multiple times in an event.
For example:

Event 1:
A=1
A=2
B=3
B=4
timestamp=X

Event 2:
A=1
A=2
B=3
B=4
timestamp=X

Event 3:
A=1
A=2
B=3
B=4
timestamp=Y

| dedup A,B,timestamp

Does this take all field values of A & B into account, so that two events remain (event 1 and event 3)?

Thanks in advance

Heinz

HeinzWaescher · ‎01-21-2014

thanks for confirming!

linu1988 · ‎01-17-2014

Yes, it keeps one event for each distinct combination of the above fields.
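For anyone who wants to check this on their own instance, here is a rough sketch that recreates the three example events. It assumes makeresults and mvappend are available in your version; event_no is a made-up helper field, and multivalue fields built with eval may not behave exactly like ones extracted from raw data.

```spl
| makeresults count=3
| streamstats count AS event_no
| eval A=mvappend("1", "2"), B=mvappend("3", "4")
| eval timestamp=if(event_no==3, "Y", "X")
| dedup A, B, timestamp
| table event_no A B timestamp
```

If dedup compares the full set of values as described, two rows should come back, one for timestamp X and one for timestamp Y.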

HeinzWaescher · ‎01-15-2014

Ah, now numbers are changing in the correct direction 🙂

And when I want to ignore events where the dedup fields don't exist, I can just use

sourcetype=* AND
A=* AND
B=*

| dedup A,B

Thanks a lot!

Ayn · ‎01-15-2014

Then that's your problem there. You can do ... | fillnull B | ... if you want B filled with a placeholder value (0 by default) in events that don't have it. That keeps dedup from dropping those events.
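A quick way to see the effect, as a sketch only: the query below assumes makeresults is available, and the values x, y, b, the placeholder "none" and the helper field n are all invented for the example. Events 2 and 3 have the same A and no B.

```spl
| makeresults count=3
| streamstats count AS n
| eval A=if(n==1, "x", "y"), B=if(n==1, "b", null())
| fillnull value="none" B
| dedup A B
| table n A B
```

I would expect n=1 and n=2 to remain. Note that once B is filled, the events that were missing it are deduplicated against each other too, so n=3 goes away as a duplicate of n=2.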

HeinzWaescher · ‎01-15-2014

Hey Ayn,

Yes, normally it should exist in all events. Is there a command to find out whether there are events without the field B, and to filter them out?

Edit:

Just tried it out with sourcetype=* AND NOT B=* .
This returns a few events.
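Another way to get the same information in one pass is to count both groups side by side. This is just a sketch on top of the same base search; has_B is a made-up field name.

```spl
sourcetype=*
| eval has_B=if(isnull(B), "no", "yes")
| stats count BY has_B
```

The "no" row shows how many events lack B, which should line up with the result of the NOT B=* search.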

Ayn · ‎01-15-2014

Does B exist in all your events? IIRC dedup drops events that are missing any of the listed fields.
