Summary index cron schedule: populate once immediately, then run on schedule?

sajeeshpn
New Member

Hi,

I am creating a new summary index and have scheduled the search that populates it to run every 6 hours. In savedsearches.conf, I have set:

cron_schedule = 0 */6 * * *

With this change, I can expect data to appear in the summary index only after 6 hours, right?

What I would like to know is whether the summary search is executed once right away and then runs every 6 hours on the schedule, so that there is some data in the summary index immediately.

Thanks,
Sajeesh


gcusello
SplunkTrust

Hi sajeeshpn,
If you want to do this, you have to pay attention to the time range of your searches. Summarization isn't normal Splunk ingestion: there is no check on already ingested events, so you risk duplicating events or losing them.
So if your next run will be at 12:00 and covers the period from 6:00 to 12:00, then to have data now you have to choose a period before 6:00. It is also better to take events not up to now but up to a safe margin before now (e.g. from -370m@m to -10m@m).
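As a rough illustration, the offset window above could be written in savedsearches.conf along these lines. This is only a sketch: the stanza name, the index names and the search string are placeholders that do not come from the original post.

```ini
# savedsearches.conf (sketch; stanza name, index names and search are placeholders)
[my_summary_search]
search = index=my_index | stats count by host
enableSched = 1
# Runs at 00:00, 06:00, 12:00 and 18:00
cron_schedule = 0 */6 * * *
# Each run covers a 6-hour window that ends 10 minutes before the run,
# so consecutive windows meet without overlapping or leaving a gap
dispatch.earliest_time = -370m@m
dispatch.latest_time = -10m@m
# Write the results to a summary index
action.summary_index = 1
action.summary_index._name = my_summary
```

With these values the 12:00 run would cover 05:50 to 11:50 and the 18:00 run 11:50 to 17:50, assuming each run is dispatched at its scheduled time.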
In addition, I suggest you verify the continuity of your logs: if there is a large indexing delay (e.g. 1 hour), you risk losing data, and you have to take this into account when choosing the safe margin.
To verify the continuity of your logs, check the difference between _time and _indextime.
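One possible way to look at that difference is a small saved search like the one below; the `search =` line can also be pasted into the search bar on its own. The stanza name, the index name and the 24-hour range are assumptions chosen for illustration.

```ini
# savedsearches.conf (sketch; stanza name and index name are placeholders)
[measure_indexing_lag]
# Indexing lag in seconds for each event, summarized per sourcetype
search = index=my_index | eval lag_seconds = _indextime - _time | stats avg(lag_seconds) AS avg_lag max(lag_seconds) AS max_lag by sourcetype
dispatch.earliest_time = -24h@h
dispatch.latest_time = now
```

If max_lag comes out well under the safe margin (600 seconds for a 10-minute margin), the margin is probably sufficient; if not, it needs to be widened.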
To be safer, you could search a larger time period (e.g. 12 hours) and add a check on _indextime to your search, excluding all events whose _indextime is more than 6 hours old.
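A sketch of that variant, using the `_index_earliest` and `_index_latest` search time modifiers to select events by index time, might look like this. The stanza name, index names and search string are again placeholders, not part of the original post.

```ini
# savedsearches.conf (sketch; stanza name, index names and search are placeholders)
[my_summary_search_by_indextime]
# Event-time range is 12 hours wide (dispatch.* below), but only events
# indexed during the last 6 hours are kept, so each event is summarized once
search = index=my_index _index_earliest=-6h@h _index_latest=@h | stats count by host
enableSched = 1
cron_schedule = 0 */6 * * *
dispatch.earliest_time = -12h@h
dispatch.latest_time = now
action.summary_index = 1
action.summary_index._name = my_summary
```

In this form an event that arrives up to about 6 hours late is still picked up by the run that follows its indexing; events delayed longer than that would fall outside the 12-hour event-time range and be missed.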
Bye.
Giuseppe

sajeeshpn
New Member

Thank you!
