I'm reaching out for help with an issue we have been seeing at one of our customers. We built an app that exports events from a customer's standalone Splunk Enterprise instance. That box gathers the logs and holds them until they can be exported off the box manually.
We use the savedsearches.conf file to schedule a search script (export.py) that pulls the events. The problem is that this particular customer is only getting about 11 minutes' worth of logs per run. The search is scheduled to pull all indexed events from, say, 3:30pm to 4:30pm, but the exported events only cover 4:19pm-4:30pm. This happens consistently on every run.
Example, missing roughly the first 49 minutes of events in each window:
4:19pm-4:30pm
5:19pm-5:30pm
6:19pm-6:30pm
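To put a number on the gap, here is a minimal sketch of the arithmetic for one window. The helper names (to_minutes, missing_minutes) are made up for illustration and are not part of export.py. It assumes 12-hour clock strings like the ones above, all on the same day.

```python
def to_minutes(clock):
    """Convert a 12-hour clock string such as "4:19pm" to minutes since midnight."""
    hhmm, suffix = clock[:-2], clock[-2:].lower()
    hours, minutes = (int(part) for part in hhmm.split(":"))
    hours = hours % 12 + (12 if suffix == "pm" else 0)
    return hours * 60 + minutes


def missing_minutes(window_start, window_end, first_event):
    """Return (expected, covered, missing) minutes for one export window."""
    expected = to_minutes(window_end) - to_minutes(window_start)
    covered = to_minutes(window_end) - to_minutes(first_event)
    return expected, covered, expected - covered


# Requested 3:30pm-4:30pm, but the first exported event is at 4:19pm.
print(missing_minutes("3:30pm", "4:30pm", "4:19pm"))  # (60, 11, 49)
```

The same 60 / 11 / 49 split shows up in every hourly window listed above.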
We have an export.py script that runs on the cron schedule below and gathers all index=* events.
savedsearches.conf
cron_schedule = 30 */1 * * *
enableSched = 1
dispatch.ttl = 1800
allow_skew = 10m
search = | export
disabled = 0
To compensate for indexing lag, we built a one-hour offset into the export.py script, so each run only pulls events up to one hour before the current time. This is the part of the script that builds the search:
now = str(time.time()-3600).split(".")[0]
query = "search index=* earliest=" + last_scan + " latest=" + now
Once the script is done, it writes the value of now (in epoch time) to a file, and the next scheduled run uses that timestamp as its starting point.
Any help would be appreciated.
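For anyone trying to follow the flow, this is a rough sketch of the window and checkpoint logic as I have described it, not the actual export.py. The function names, the checkpoint path argument, and the one-hour fallback window for the very first run are assumptions for illustration. The only Splunk-specific parts are the earliest and latest time modifiers, which accept epoch seconds.

```python
import os
import time

LAG_SECONDS = 3600  # one-hour offset described above


def read_last_scan(path, default):
    """Return the epoch string saved by the previous run, or `default` if none exists."""
    if not os.path.exists(path):
        return default
    with open(path) as handle:
        value = handle.read().strip()
    return value or default


def build_query(last_scan, now):
    """Build the search string with explicit epoch bounds."""
    # `earliest` and `latest` are Splunk time modifiers; a misspelled name
    # would be treated as an ordinary field filter instead of a time bound.
    return "search index=* earliest=" + last_scan + " latest=" + now


def write_checkpoint(path, now):
    """Persist the upper bound of this run so the next run starts from it."""
    with open(path, "w") as handle:
        handle.write(now)


def plan_run(path, current_time=None):
    """Return (last_scan, now, query) for one scheduled run."""
    current = time.time() if current_time is None else current_time
    now = str(int(current) - LAG_SECONDS)
    # Assumption: on the first run, fall back to a one-hour window.
    last_scan = read_last_scan(path, default=str(int(now) - 3600))
    return last_scan, now, build_query(last_scan, now)
```

A run would call plan_run, execute the returned query, and call write_checkpoint with the same now value only after the export succeeds, so that a failed run does not skip a window.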
index=* OR index=_* earliest=" + last_scan + " latest=" + now + " | append [ | mpreview index=*_metrics ] | fields - _bkt, _cd, _serial, _si
I forgot to include the query that is currently running:
index=* OR index=_* | append [ | mpreview index=*_metrics ] | fields - _bkt, _cd, _serial, _si