Is it possible to not wait until all events from Graph are pulled down before writing to index?
I had to rebuild this app due to an error on my end. Although Graph accepts going back 30 days, our Splunk instance keeps hanging after some time. I assume this is because the script tries to hold all events from the time frame in memory before writing them out.
A new version was just uploaded to address this use case by introducing a query limit parameter. This parameter will limit the number of results retrieved each interval by enforcing a start and end date in the API call to Microsoft Graph. Typically, only a start date is used and the add-on recursively retrieves all new events each time it runs. By specifying a query limit, the add-on will only query for a range of data.
Use this setting with caution, as specifying a limit that is too small will result in no data being collected. Perhaps start with a 5 day (7200 minute) query limit.
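As a rough illustration of the behaviour described above, the sketch below shows how a query limit in minutes could turn an open-ended start date into a bounded start/end range. It is not the add-on's actual code; query_window, iter_windows, and their parameters are hypothetical names chosen for this example.

```python
from datetime import datetime, timedelta, timezone


def query_window(start, now, limit_minutes=None):
    """Return the (start, end) range for one collection run.

    Hypothetical helper: without a limit the range runs up to `now`;
    with a limit the end is capped at start + limit_minutes.
    """
    if limit_minutes is None:
        return start, now
    end = min(start + timedelta(minutes=limit_minutes), now)
    return start, end


def iter_windows(start, now, limit_minutes):
    """Yield successive bounded ranges until the backlog is covered."""
    while start < now:
        _, end = query_window(start, now, limit_minutes)
        yield start, end
        start = end  # the next run picks up where this one stopped


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    backlog_start = now - timedelta(days=30)
    for lo, hi in iter_windows(backlog_start, now, limit_minutes=7200):
        print(lo.isoformat(), "->", hi.isoformat())
```

Under these assumptions, a 7200 minute limit splits a 30 day backlog into six 5 day ranges, so each run only has to handle one range at a time.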
It depends on whether the input script in your add-on collects the data in batches and forwards each batch for indexing. If it does not, I am afraid we are talking about customizing the add-on's collection mechanism.
A quick workaround for this kind of problem is to set the start date for your data input to a recent date (e.g. a couple of days back). Once the data for those days is ingested, move the start date 2 more days back and repeat. This reduces the size of each data batch, which is probably what is causing the memory spikes on your Splunk instance and making it hang.
Once the data for your last 30 days is completely in, the script would only send you the latest batch of data as per your cron schedule.
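To make the workaround concrete, here is a small sketch that lists the start dates you would set on the input, one after another, to walk back 30 days in 2 day steps. The function name and its parameters are hypothetical and only illustrate the schedule; the start date itself still has to be changed in the input configuration by hand.

```python
from datetime import date, timedelta


def backfill_start_dates(today, total_days=30, step_days=2):
    """Return the sequence of start dates for a stepwise backfill.

    Hypothetical helper: each entry is the start date to configure
    once the previous, more recent range has finished ingesting.
    """
    dates = []
    days_back = step_days
    while days_back < total_days:
        dates.append(today - timedelta(days=days_back))
        days_back += step_days
    # Final step lands exactly on the full look-back period.
    dates.append(today - timedelta(days=total_days))
    return dates


if __name__ == "__main__":
    for start in backfill_start_dates(date(2024, 1, 31)):
        print(start.isoformat())
```

With the defaults this gives 15 start dates, from 2 days back to 30 days back.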
Hope this helps. Let me know.