Splunk Search

Identify predictor fields

eregon
Path Finder

Good morning fellow Splunkthiasts!

I have an index receiving 100k+ events per minute (all of them with the same sourcetype), and approximately 100 fields are known in this dataset. Some of these events are duplicated, while others are unique. My aim is to understand the duplication and be able to explain exactly which events get duplicated.

I am detecting duplicates using this SPL:

 

index="myindex" sourcetype="mysourcetype" | eventstats count AS duplicates BY _time, _raw

 

Now I need to identify which fields, or which combination of fields, make the difference, i.e. under what circumstances an event is ingested twice.

I tried the predict command. It produces new values for the "duplicates" field, but it does not disclose the rule by which it makes its decisions. In other words, I am not interested in the prediction itself; I want to know the predictors.

Is something like that possible in SPL?
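
One way to start narrowing this down, offered only as a rough sketch and not as a built-in "find the predictors" feature, is to compare the share of duplicated events across the values of a single candidate field. The search below reuses the eventstats step from the question and then summarizes by host. host is used only because it is a default field; the names is_duplicate, events, duplicated_events and duplicate_pct are made up for illustration.

```spl
index="myindex" sourcetype="mysourcetype"
| eventstats count AS duplicates BY _time, _raw
| eval is_duplicate=if(duplicates > 1, 1, 0)
| stats count AS events, sum(is_duplicate) AS duplicated_events BY host
| eval duplicate_pct=round(100 * duplicated_events / events, 2)
| sort - duplicate_pct
```

Values whose duplicate_pct sits far above or below the overall rate are candidates for further inspection. Swapping host for another field, or listing several fields after BY, repeats the comparison for other candidates or combinations. With this event volume it is probably wise to try it on a short time range first, since eventstats grouped by _raw can be expensive.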

