Deployment Architecture

Summary index cron schedule to populate first then schedule

sajeeshpn
New Member

Hi,

I am creating a new summary index and have scheduled the populating search to run at 6-hour intervals. In savedsearches.conf, I put:

cron_schedule = 0 */6 * * *

With this change, I can expect some data to get populated in the summary index only after 6 hours, right?

But I would like to know whether the summary search gets executed once immediately and is then scheduled to run every 6 hours, so that there would be some data in the summary index right away.
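
For reference, a minimal savedsearches.conf stanza for a summary-index populating search might look roughly like this (the stanza name, base search, and index name are placeholders, not from the original post):

[hypothetical summary populating search]
search = index=web sourcetype=access_combined | sistats count by status
enableSched = 1
cron_schedule = 0 */6 * * *
dispatch.earliest_time = -6h@h
dispatch.latest_time = @h
action.summary_index = 1
action.summary_index._name = my_summary_index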

Thanks,
Sajeesh

gcusello
SplunkTrust

Hi sajeeshpn,
if you want to do this, be careful with the time period of your searches: summarization isn't normal Splunk ingestion, so there is no check for already-ingested events, and you risk either duplicating events or losing them.
So, if your next run is at 12:00 covering the period from 6:00 to 12:00, then to have data immediately you have to backfill a period that ends before 6:00. It's also better not to take events right up to now, but only up to a safe margin before now (e.g. from -370m@m to -10m@m).
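For example, that safety margin can be expressed directly in the dispatch window of the scheduled search in savedsearches.conf (offsets as in the example above; everything else stays as in the question):

cron_schedule = 0 */6 * * *
dispatch.earliest_time = -370m@m
dispatch.latest_time = -10m@m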
In addition, I suggest you verify the continuity of your logs: if there is a large indexing delay (e.g. 1 hour), you risk losing data, and you have to take this into account when choosing the safe margin.
To verify the continuity of your logs, check the difference between _time and _indextime.
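A quick way to check that (the index name is a placeholder) is something like:

index=your_index earliest=-24h@h
| eval index_lag_seconds = _indextime - _time
| stats max(index_lag_seconds) AS max_lag avg(index_lag_seconds) AS avg_lag perc95(index_lag_seconds) AS p95_lag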
To be safer, you could take a larger time period (e.g. 12 hours) and add a check on _indextime to your search, excluding all events indexed more than 6 hours ago.
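As a sketch of that idea (the index name and split-by field are placeholders), the populating search could cover 12 hours of event time but keep only events indexed in the last 6 hours:

index=your_index earliest=-12h@h latest=-10m@m
| where _indextime >= relative_time(now(), "-6h@h")
| sistats count by host

For a scheduled search, now() corresponds to the dispatch time, so the 6-hour cut-off lines up with the previous run.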
Bye.
Giuseppe


sajeeshpn
New Member

Thank you!
