Deployment Architecture

Summary index cron schedule: does it populate first, then run on schedule?

sajeeshpn
New Member

Hi,

I am creating a new summary index and have scheduled the populating search to run every 6 hours. In savedsearches.conf I set:

cron_schedule = 0 */6 * * *

With this schedule, I understand that data will start appearing in the summary index only after the first 6-hour interval has elapsed.

What I would like to know is whether the scheduled search is executed once immediately and then every 6 hours thereafter, so that there would be some data in the summary index right away.

Thanks,
Sajeesh

1 Solution

gcusello
Legend

Hi sajeeshpn,
if you want to do this, you have to pay attention to the time period of your searches: summarization is not normal Splunk ingestion, so there is no check for already-summarized events, and you risk duplicating events or losing them.
So if the next scheduled run is at 12:00, covering the period from 06:00 to 12:00, any run you do now to pre-populate the index must cover a period that ends before 06:00. It is also better not to search up to now but only up to a safe margin before now (e.g. from -370m@m to -10m@m).
In addition, I suggest you verify the continuity of your logs: if some events arrive with a large delay (e.g. 1 hour), you risk losing them, and you have to take this into account when choosing the safe margin.
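As a sketch, the safe window described above can be expressed with the scheduled search's dispatch times in savedsearches.conf (the stanza name, base search, and summary index name here are hypothetical examples):

```ini
# Hypothetical summary-populating search; stanza name and base search are examples
[populate_my_summary]
search = index=main sourcetype=my_sourcetype | sistats count by host
cron_schedule = 0 */6 * * *
# Search a full 6-hour window, shifted back by a 10-minute safety margin
# so that late-arriving events have time to be indexed
dispatch.earliest_time = -370m@m
dispatch.latest_time = -10m@m
# Write the results into the summary index
action.summary_index = 1
action.summary_index._name = my_summary
```

With dispatch.earliest_time = -370m@m and dispatch.latest_time = -10m@m, each run still covers a full 6-hour window, just shifted 10 minutes into the past.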
To verify the continuity of your logs, check the difference between _time and _indextime.
To be safer, you could search a larger time period (e.g. 12 hours) and add a check on _indextime to your search, excluding all events that were indexed more than 6 hours ago.
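The two checks above can be sketched in SPL (the index, sourcetype, and final stats are assumptions, not part of the original post). The first search measures the indexing lag over the last 24 hours:

```
index=main sourcetype=my_sourcetype earliest=-24h
| eval lag = _indextime - _time
| stats avg(lag) max(lag) by sourcetype
```

The second searches a wider 12-hour window but keeps only events indexed in the last 6 hours, so late-arriving events are less likely to be missed:

```
index=main sourcetype=my_sourcetype earliest=-12h@h latest=-10m@m
| eval itime = _indextime
| where itime >= relative_time(now(), "-6h@h")
| sistats count by host
```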
Bye.
Giuseppe



sajeeshpn
New Member

Thank you !
