Splunk Search

tstats with count() works but dc() produces 0 results

thisissplunk
Builder

I'm using tstats on an accelerated data model that is built on top of a summary index. Querying both the summary index and the data model works as expected everywhere except in one exceptionally large environment, which holds 10-100x more results than the others. In that environment, dc() fails.

This works fine in that environment and returns roughly 17,000,000:

| tstats summariesonly=true count(assets.hostname) from datamodel="Summary_Host_Data" where (earliest=-1d latest=now)

 

This returns 0 results, although it should return roughly 400,000:

| tstats summariesonly=true dc(assets.hostname) from datamodel="Summary_Host_Data" where (earliest=-1d latest=now)

 

Meanwhile, the equivalent search directly against the summary index works fine and returns roughly 400,000:

index=summary_host_data earliest=-1d | stats dc(hostname)

 

Finally, if I search over 6 hours instead of 1 day, I do get results from tstats with dc().

Is there some type of limit I'm running into with dc()? Or is there something else going on?


PradReddy
Path Finder

Hi thisissplunk,

Your tstats syntax looks correct, and I am able to get valid output for distinct_count on my end.
To my understanding, there are no limitations on the distinct_count aggregate function.

When you enable acceleration for a data model, Splunk software builds the initial set of .tsidx file summaries for the data model and then runs scheduled searches in the background every 5 minutes to keep those summaries up to date. Each update ensures that the entire configured time range is covered without a significant gap in data. This method of summary building also ensures that late-arriving data is summarized without complication.

Can you please verify the execution status of the data model acceleration searches using the search below:

index=_internal sourcetype="scheduler" savedsearch_id="<user>;<appname>;_ACCELERATE_DM_<appname>_<DataModelName>_ACCELERATE_"


------

An upvote would be appreciated and Accept Solution if it helps!

thisissplunk
Builder

Thanks for responding. I've checked for those logs and they all return "success" for the data model acceleration queries.
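
For anyone repeating this check, a minimal sketch of how the scheduler logs could be summarized per acceleration search. It assumes the standard `savedsearch_name` and `status` fields of the scheduler sourcetype, and the wildcard is a stand-in for the exact `<appname>` and `<DataModelName>` placeholders from the search above:

```spl
index=_internal sourcetype=scheduler savedsearch_name="_ACCELERATE_DM_*"
| stats count by savedsearch_name, status
```

Any status other than "success" for the relevant data model would point at the acceleration itself rather than at dc().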

What I think is going on is that I'm running into some kind of memory error. This is reinforced by:

  1. estdc() working but dc() not
  2. dc() working on smaller timeframes (6 hours and below)

I just can't figure out where or why.


bowesmana
SplunkTrust

@thisissplunk 

Have you looked at the job inspector to see if it gives any clues? Also have a look at the search log.

You can get more information in the search log by enabling debug logging.

Have a look at Clara Merriman's great article on the job inspector, which also explains which changes to make in limits.conf to get extra debug output in the log.

https://www.splunk.com/en_us/blog/tips-and-tricks/splunk-clara-fication-job-inspector.html

 


yotamros
Explorer

Did you end up finding a solution to this? I have a similar case and there is no info anywhere on the matter.


thisissplunk
Builder

As far as I could tell, it was some type of silent memory or data limit. I got away with using estdc() instead, since 100% accuracy wasn't required for my use case.
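
For reference, the estdc() variant is the original query with only the aggregate function swapped. This is a sketch based on the search from the question; the result is an estimate, not an exact distinct count:

```spl
| tstats summariesonly=true estdc(assets.hostname) from datamodel="Summary_Host_Data" where (earliest=-1d latest=now)
```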

You can try limiting the time frame or amount of events and see where it starts breaking with dc().
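
One way to narrow this down, sketched against the query from the question: since 6 hours worked and 1 day did not, try a window in between and keep halving the gap. The 12-hour value here is only an illustrative midpoint, not a known threshold:

```spl
| tstats summariesonly=true dc(assets.hostname) from datamodel="Summary_Host_Data" where (earliest=-12h latest=now)
```

If this returns a result, widen the window; if it returns nothing, shrink it.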

I'm not sure how to fix the issue. Maybe a config limit or just more memory on the server.
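
If an exact count is required, one alternative that may be worth trying is to group by the field and count the resulting rows instead of calling dc(). This was not tested in this thread, so treat it as an assumption that it avoids whatever limit dc() is hitting; `distinct_hostnames` is just an illustrative output field name:

```spl
| tstats summariesonly=true count from datamodel="Summary_Host_Data" where (earliest=-1d latest=now) by assets.hostname
| stats count AS distinct_hostnames
```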
