Splunk Search

Is it possible to dedup by span?

the_wolverine
Champion

I'm trying to chart some data using span=1d and was wondering whether it is possible to dedup data across a time range with span.

For example, I want to dedup duplicate users in a single day, but I also want those users to show up in previous days when I'm charting over a week.

I'm guessing * | dedup user | timechart span=7d .. would keep those users from showing up on days 2-7.

I hope that makes sense.

sideview
SplunkTrust

* | timechart dc(user) span=7d

where dc means "distinct count of".

This makes timechart count the distinct users per bucket. Since the span argument sets the bucket size to 7 days, you end up counting the distinct users in every 7-day period.
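
For the per-day case in the question, here is a minimal sketch, assuming the events carry a user field and that one-day buckets are what is wanted. The first search counts distinct users per day. The second buckets _time into days before dedup, so a user is dropped only when repeated within the same day and still shows up on other days.

```spl
* | timechart span=1d dc(user)

* | bin _time span=1d | dedup _time user | timechart span=1d count
```

Both should give the same daily numbers. The dedup form is mainly useful when you need the surviving events themselves rather than just a count.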

sideview
SplunkTrust

Probably, but you'll have to tell me more, because that pseudo-search syntax is pretty ambiguous. To take a wild guess and at least tell you something interesting: you can use the bin command to bucket numeric quantities, and then use stats/chart/timechart to group by those bucketed values. For example, "* | bin someNumericField span=100 | chart count over someNumericField" will yield a nice chart with "0-100", "100-200", "200-300" on the x-axis.
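
On the top(field) question, one possible reading is "the X most common values of a field, charted per time bucket". A sketch under that assumption, where someField and the limit of 5 are placeholders: timechart's limit option keeps the series with the largest totals across the whole search range (not per bucket), and useother=f drops the remaining series instead of lumping them into OTHER.

```spl
* | timechart span=1d limit=5 useother=f count by someField
```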

the_wolverine
Champion

I'll test this, thank you. Is there a way to chart top(field) limit=X using a span?

kristian_kolb
Ultra Champion

Makes sense. Depending on what you ultimately want out of the logs, something like this could work:

...| stats values(user) by date_wday

or date_mday if that suits you better.


UPDATE:

Or rather, use:

... earliest=-X latest=-Y | timechart span=1d values(user)
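
As one concrete instance of the search above, here is a sketch assuming a window of the last seven full days. The -7d@d and @d values are illustrative stand-ins for X and Y, not values from the original post. Adding dc(user) next to values(user) gives the per-day count alongside the per-day list of users.

```spl
... earliest=-7d@d latest=@d | timechart span=1d values(user) AS users, dc(user) AS user_count
```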

Hope this helps,

Kristian