Splunk Search

Dedup duplicates

HeinzWaescher
Motivator

Hi,

what is the easiest way to filter out event duplicates without adding every field in the dedup command?
Is
| dedup _raw

the correct approach?

BR

Tags (2)
1 Solution

Ayn
Legend

dedup _raw should work just fine, yes.

View solution in original post

HeinzWaescher
Motivator

I've got two additional questions regarding this topic:

  1. How can I search for the count of events that have duplicates?
  2. How can I search for the total number of duplicates?

BR

Heinz

0 Karma

HeinzWaescher
Motivator

Unfortunately I don't have an unique identifier for each event like your proposed session_id

0 Karma

Rocket66
Communicator

You can count duplicated event by using the "transaction" command. And then count the events by using "eventcount"

eg.:

eventtype="*" | transaction session_id | Where eventcount>1 | stats count by eventcount

to find out how many duplicates occured

or:

eventtype="*" | transaction session_id | Where eventcount>1 | stats count(eventcount)

to count how many different duplicated events occured

or ...

0 Karma

Ayn
Legend

dedup _raw should work just fine, yes.

HeinzWaescher
Motivator

great, thanks

0 Karma

ITUser1
Explorer

When I try and enter the "|dedup _raw" command at the end of my search parameter I end up with no matches but when I take it off the end I end up with thousands. I can see that they are duplicates(same IP address, name, and port) but it still doesn't work. any suggestions?

0 Karma
Get Updates on the Splunk Community!

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

  🚀 Your data just got a serious AI upgrade — are you ready? Say hello to the Agentic Era with the ...

Stronger Security with Federated Search for S3, GCP SQL & Australian Threat ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Accelerating Observability as Code with the Splunk AI Assistant

We’ve seen in previous posts what Observability as Code (OaC) is and how it’s now essential for managing ...