
How long does it take for data to be accelerated in accelerated data models?

Splunk Employee

Hello,

From the documentation section "How Splunk Enterprise builds data model acceleration summaries", I understand that the scheduled search is executed every 5 minutes.

What data is accelerated on each 5-minute run? Is it earliest=now, earliest=-1m@m, or something different? I ran several tests, and the closest result I got was about 2 minutes. I used the Splunk audit model. If someone can let me know what's behind that, it would be great.

I want to find out the maximum time it takes for data to be accelerated.

thanks a lot
matthias

1 Solution

Motivator

TL;DR: With a "1 Day" summary range, the data model acceleration search was found to run every 5 minutes (on minutes ending in 0 and 5), searching from -1d to now.


I created a test data model testdm in the launcher app, shared it, accelerated it for 1d, then queried Splunk about it over REST:

| rest /services/search/jobs count=0 splunk_server=local 
| search label="*testdm*" NOT label="| rest *" | head 1 | transpose

Here are some interesting fields returned:

author                   splunk-system-user
defaultTTL               600
eai:acl.app              launcher
eai:acl.sharing          global
eai:acl.ttl              600
earliestTime             2015-01-13T10:40:00.000+00:00
isSavedSearch            1
label                    _ACCELERATE_DM_launcher_testdm_ACCELERATE_
latestTime               2015-01-14T10:40:00.000+00:00
published                2015-01-14T10:40:01.000+00:00
request.earliest_time    -1d
request.latest_time      now
sid                      scheduler__nobody__launcher__RMD5afc7e3a6060d63f4_at_1421232000_2
title                    summarize tstats=t override=partial manual_rebuilds=f max_time=3600 id=DM_launcher_testdm [  search ...

The job is a regular scheduled search that runs every 5 minutes, on minutes ending in 0 and 5 ( */5 * * * * ), and searches from -1d to now. You can see the one-second delay between the scheduled time ("now", which is latestTime) and the kickoff time (published).
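To make the timing concrete, here is a minimal Python sketch of the schedule observed above. It assumes the */5 cron and the -1d to now window from this one test, and it ignores search runtime, indexing lag, and scheduler delay (such as the one second seen here). The function names are made up for illustration and are not part of Splunk.

```python
from datetime import datetime, timedelta

PERIOD = timedelta(minutes=5)       # assumption: cron */5 * * * * as observed
SUMMARY_RANGE = timedelta(days=1)   # assumption: "1 Day" summary range

def next_run(ts: datetime) -> datetime:
    """Return the first 5-minute boundary strictly after ts."""
    floored = ts.replace(second=0, microsecond=0)
    floored -= timedelta(minutes=floored.minute % 5)
    return floored + PERIOD

def search_window(run_at: datetime) -> tuple[datetime, datetime]:
    """Return the (earliest, latest) window a run searches: -1d to now."""
    return run_at - SUMMARY_RANGE, run_at

def wait_for_next_run(event_time: datetime) -> timedelta:
    """Time from an event until the next scheduled run starts."""
    return next_run(event_time) - event_time

# An event landing exactly on a boundary waits a full period in this model.
print(wait_for_next_run(datetime(2015, 1, 14, 10, 40, 0)))   # 0:05:00
# An event landing just before a boundary waits almost nothing.
print(wait_for_next_run(datetime(2015, 1, 14, 10, 44, 59)))  # 0:00:01
```

Under these simplified assumptions, the wait until the next summarization run starts ranges from just above zero up to 5 minutes, plus however long the run itself takes. A delay of about 2 minutes, as seen in the question, falls inside that range.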

The job lives for the default of twice the schedule period (10 minutes, matching the defaultTTL of 600 seconds), gets a SID that doesn't mention acceleration, is owned by the system user, and is shared to everyone. It even has a name that you can see via REST but not in the Jobs window: _ACCELERATE_DM_[app name]_[dm name]_ACCELERATE_.
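Although the SID doesn't mention acceleration, it does seem to carry the scheduled time. The Python sketch below assumes the number after _at_ is UTC epoch seconds. That is an inference from this one SID, not something the output states, but it decodes to the same instant as latestTime. The helper names are hypothetical.

```python
import re
from datetime import datetime, timezone

SID = "scheduler__nobody__launcher__RMD5afc7e3a6060d63f4_at_1421232000_2"

def scheduled_time(sid: str) -> datetime:
    """Read the number after '_at_' as UTC epoch seconds (assumption)."""
    match = re.search(r"_at_(\d+)_", sid)
    if match is None:
        raise ValueError(f"no _at_<epoch>_ segment in {sid!r}")
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)

def acceleration_label(app: str, model: str) -> str:
    """Build the job label using the naming pattern seen via REST."""
    return f"_ACCELERATE_DM_{app}_{model}_ACCELERATE_"

print(scheduled_time(SID).isoformat())           # 2015-01-14T10:40:00+00:00
print(acceleration_label("launcher", "testdm"))  # _ACCELERATE_DM_launcher_testdm_ACCELERATE_
```

The decoded time matches latestTime (2015-01-14T10:40:00), one second before published, and the generated label matches the label field in the REST output.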

I found it interesting that there did not seem to be any mention of the granularity of the summary being generated, other than the max_time argument to the summarize command - this must be handled automatically by the summarize command.

