
Applying "dedup" in a rolling time window?

Path Finder

I want to deduplicate some events within a time period, but it's a rolling 24-hour frame so I can't just go off of one of the date fields. The only way I've figured out so far is to use transaction, but a transaction is a very expensive operation for something as trivial as this, and I then also have the processing & cognitive overhead of selecting the correct values from the multi-valued fields I end up with.


Re: Applying "dedup" in a rolling time window?

Legend

I'm not sure I understood your requirement, so let me know if I've misread the question. As I understand it, you have a search running over a rolling 24-hour frame and you want to make sure that certain events do not show up more than once, though I'm not sure with regard to what: time plus the value of a specific field? If that's the case, let's say you want only unique values of the field myfield within some chosen time period, say one hour. I imagine this would do the trick:

... | bucket _time span=1h | dedup myfield _time

Re: Applying "dedup" in a rolling time window?

Path Finder

That's pretty much my problem, yes, but the catch is that the time period is rolling. My understanding is that if I aggregate events into buckets, events in the last 5 minutes of one bucket would not be deduplicated against events in the first 5 minutes of the next bucket. At the moment I'm doing transaction mykey maxspan=24h.
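To make the boundary concern concrete, here is a small self-contained sketch that fabricates two events with makeresults. The timestamps and the value "a" are made up purely for illustration; myfield is the field name used earlier in the thread. The two events are only four minutes apart but sit on either side of an hour boundary, so after bucketing they carry different _time values and dedup should keep both:

```spl
| makeresults count=2
| streamstats count AS n
| eval myfield="a"
| eval _time=if(n=1, strptime("2024-01-01 00:58:00", "%Y-%m-%d %H:%M:%S"), strptime("2024-01-01 01:02:00", "%Y-%m-%d %H:%M:%S"))
| bucket _time span=1h
| dedup myfield _time
```

Moving the second timestamp to 00:59:00 puts both events in the same bucket, and the same search should then return a single row. This assumes a search-time zone with a whole-hour offset, so that the hourly buckets start on the hour.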


Re: Applying "dedup" in a rolling time window?

Legend

Well yes, what bucket does is precisely to divide time into discrete buckets, so an event ends up in exactly one bucket (with regard to _time) or another. Off the top of my head I don't know a way to handle this kind of situation (other than using transaction).
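For what it's worth, one possible direction that avoids transaction is streamstats: carry the previous event's timestamp forward per key and drop any event that falls within 24 hours of it. The following is an untested sketch, not a confirmed answer from this thread. It assumes the key field is mykey (as in the transaction search above), and prev_time is a hypothetical helper field introduced only for this example:

```spl
... | sort 0 mykey _time
| streamstats current=f last(_time) AS prev_time BY mykey
| where isnull(prev_time) OR _time - prev_time > 86400
| fields - prev_time
```

Note the semantics: each event is compared with the immediately preceding event for the same key, so a steady stream of events less than 24 hours apart collapses to just the first one. That is closer to transaction's maxpause than to maxspan, so it may or may not match what is needed here. The sort 0 is required so that events arrive in ascending time order per key, and it can itself be costly on large result sets.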
