I want to use the dedup command with more than one criterion.
First I used | dedup A and had 100 events afterwards.
Then I used | dedup A, B and had 70 events afterwards. In my understanding the number of events should increase, because I've made the dedup criteria more specific, so fewer duplicates should be identified. Am I completely wrong?
dedup keepempty=t A B
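A small sketch of why the count can drop: by default dedup discards events in which any of the listed fields is missing, so adding B also removes every event that has no B. The makeresults data below is made up for illustration (n, a1 to a3 and b1 are hypothetical values, not your data).

```spl
| makeresults count=6
| streamstats count AS n
| eval A=case(n<=2, "a1", n<=4, "a2", true(), "a3")
| eval B=if(n==2 OR n==4, "b1", null())
| dedup A B
```

Swapping the last line for | dedup A or | dedup keepempty=true A B lets you compare the three counts on the same data. With keepempty=true, events that lack A or B are kept instead of dropped.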
My understanding is that dedup on 3 fields finds all matches on any two of them as duplicates. I will cite my source for that in a moment or just provide the results of a test case in support of that assertion, but I remember learning it in a Splunk course and testing it myself for validation.
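One way to check this is a tiny test with made-up data (all values below are hypothetical). The two events agree on A and B but differ in C:

```spl
| makeresults count=2
| streamstats count AS n
| eval A="1", B="1", C=tostring(n)
| dedup A B C
```

If a match on any two fields were enough, only one event would remain. The dedup documentation describes the command as keeping events with a unique combination of values across all listed fields, in which case both events stay.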
A further question regarding the dedup command:
Let's say the fields A & B can appear multiple times in an event.
| dedup A,B,timestamp
Does this include all field values for A & B, and does it result in two remaining events (event 1 and event 3)?
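I can't say for certain from the description alone, but this is easy to try on made-up data. A sketch, assuming A is a multivalue field (the values x, y, z and b are hypothetical): events 1 and 2 carry the same set of values for A, event 3 a different one.

```spl
| makeresults count=3
| streamstats count AS n
| eval A=split(if(n==3, "x,z", "x,y"), ",")
| eval B="b"
| dedup A B
| table n A B
```

Comparing the surviving values of n shows how dedup treats the repeated values.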
Thanks in advance
Ah, now numbers are changing in the correct direction 🙂
And when I want to ignore events where the dedup fields don't exist, I can just use
| dedup A,B
Thanks a lot!
Yes, normally it should exist in all events. Is there a command to find out whether there are events without the field B, and to filter them out?
Just tried it out with | search sourcetype=* AND NOT B=*
This results in a few events
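Another way to do both, as a sketch: first count how many events have B and how many don't. has_B is a field name made up for this example.

```spl
sourcetype=*
| eval has_B=if(isnotnull(B), "yes", "no")
| stats count BY has_B
```

To filter the events without B out before deduplicating:

```spl
sourcetype=*
| where isnotnull(B)
| dedup A B
```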