Splunk Search

Dedup lost data, a bug?

april_tao
New Member

For below search :

eventtype=MYTYPE [search eventtype=MYTYPE | sort 0 _time desc | dedup fieldX | return 1000 source]

Expect to return the latest source for 1 fieldX value.

In our data, we have over 10,000,000 events for the latest source with fieldX=A, fieldX=B, fieldX=C respectively.
Thus expect the search returns over 30,000,000 results.
However, it returns the results for fieldX=A only.

Question : is the search correctly written? If yes, is this a bug of dedup? Is there any limitation of dedup about the result size? If we use a smaller dataset, dedup works properly with the same search.

Tags (2)
0 Karma

lguinn2
Legend

The problem is that the subsearch has a limit - and I don't see that you need the subsearch at all. You also do not need the sort, Splunk returns events in reverse time order (newest first) by default.Try this

eventtype=MYTYPE | dedup fieldX

This will return the most recent event for each value of fieldX.

0 Karma
Get Updates on the Splunk Community!

Aligning Observability Costs with Business Value: Practical Strategies

 Join us for an engaging Tech Talk on Aligning Observability Costs with Business Value: Practical ...

Mastering Data Pipelines: Unlocking Value with Splunk

 In today's AI-driven world, organizations must balance the challenges of managing the explosion of data with ...

Splunk Up Your Game: Why It's Time to Embrace Python 3.9+ and OpenSSL 3.0

Did you know that for Splunk Enterprise 9.4, Python 3.9 is the default interpreter? This shift is not just a ...