Knowledge Management

stats dc behavior against a summary index

vbumgarner
Contributor

So I have a summary index that was populated hourly with something like:

sourcetype="foo" | sistats count dc(s) by d

I can then do this:

index="summary_foo_hourly" | stats dc(s)

but I cannot do this:

index="summary_foo_hourly" | stats dc(d)

nor this:

index="summary_foo_hourly" | stats dc(s) dc(d)

as dc(d) always returns zero.

Any reason this shouldn't work?

I can get around it like so:

index="summary_foo_hourly" | stats values(s) as s by d | stats dc(s) dc(d)

but that's kind of a drag.

vbumgarner
Contributor

sistats preserves the original values, storing them in special fields that a later stats command knows how to interpret. Try it. It works.
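
To illustrate why retaining the values (rather than only a count) keeps a later distinct count possible, here is a minimal Python sketch. It is a toy model with made-up rows and a hypothetical s_values field, not a description of how Splunk actually stores sistats output:

```python
# Toy model only: this is NOT Splunk's real sistats storage format.
# Each hourly summary row keeps the set of distinct values of s it saw,
# instead of just the number of distinct values.
hourly_rows = [
    {"d": "a", "s_values": {"u1", "u2"}},
    {"d": "b", "s_values": {"u2", "u3"}},
    {"d": "a", "s_values": {"u1", "u4"}},
]


def merged_distinct_count(rows):
    """Union the retained value sets across rows, then count once."""
    seen = set()
    for row in rows:
        seen |= row["s_values"]
    return len(seen)


print(merged_distinct_count(hourly_rows))  # 4 (u1, u2, u3, u4)
```

Because the sets themselves survive, they can be merged across any number of rows before counting, which is roughly the property that makes stats dc(s) work against the summary.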

The question is why stats won't perform a dc() on one of the "by" fields captured using sistats.

The approach in this post dates from 2011. These days you'd use an accelerated data model, though if sistats produces a sufficiently small number of rows, the summary index might still be faster than the accelerated data model.

woodcock
Esteemed Legend

You are WAY off track here! Think about what you are doing: is it proper to do dc(dc(anything))? Once you do dc on anything that is rolled up into a summary index, the only way you can roll it up again is with something like avg(dcField) or mean(dcField). You cannot get a valid daily dc(users) by taking an hourly dc(users) and doing sum(hourlyUsers) over the last 24 hours. It is a one-way ticket: once you go the dc route, you must never roll those values up again (even if SPL allows it).
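
A small Python sketch of the arithmetic behind this, using made-up user names and hours (all names here are hypothetical): summing per-hour distinct counts overcounts anyone who appears in more than one hour.

```python
# Hypothetical data: users seen in each of three hours.
users_by_hour = {
    "09:00": ["alice", "bob"],
    "10:00": ["bob", "carol"],
    "11:00": ["alice", "carol", "dave"],
}


def hourly_dc(events_by_hour):
    """Distinct count computed separately for each hour."""
    return {hour: len(set(users)) for hour, users in events_by_hour.items()}


def true_dc(events_by_hour):
    """Distinct count over the whole period, from the raw values."""
    return len({user for users in events_by_hour.values() for user in users})


hourly = hourly_dc(users_by_hour)
summed = sum(hourly.values())

print(hourly)                  # {'09:00': 2, '10:00': 2, '11:00': 3}
print(summed)                  # 7: bob, alice and carol are counted twice
print(true_dc(users_by_hour))  # 4
```

The average of the hourly counts (7 / 3 here) is a meaningful number, but it answers "how many distinct users in a typical hour", not "how many distinct users in the period".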
