Hello to all,
I have noticed a strange behavior which suggests some kind of data limit (I don't know on which side).
I'm querying the SecurityEvents table of Log Analytics (which has a lot of events) and measuring the event count with: index=log_analytics source="log_analytics://*" | eval _time = _indextime | timechart count span=5m by source
When the delay is set to 5 min and the interval is >= 5 min, the maximum number of events is approx. 49k.
When I decrease the interval to 3 min (delay still 5 min), the maximum roughly doubles to approx. 95k, and when I decrease it to 1 min the maximum is between 130k and 150k, which roughly matches the number I get when I do a count (via another input) for the same input over a 5 min/5 min timeframe.
After increasing the delay to 10 min and the interval to 10 min as well, the number returns to approx. 48k.
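To make the pattern in these numbers easier to see, here is a small sketch of the arithmetic, assuming (and this is only my guess from the ~48-49k ceiling, not something I have confirmed) that each run of the input returns at most a fixed number of events. The cap value and the function name are hypothetical.

```python
import math

# Hypothetical per-run cap, inferred only from the ~48-49k ceiling seen above.
ASSUMED_CAP_PER_RUN = 49_000
BUCKET_MINUTES = 5  # matches span=5m in the timechart


def max_events_per_bucket(interval_minutes, cap=ASSUMED_CAP_PER_RUN,
                          bucket_minutes=BUCKET_MINUTES):
    """Upper bound on events indexed in one timechart bucket,
    if every run of the input returns at most `cap` events."""
    # Largest number of runs that can start inside one bucket.
    runs = math.ceil(bucket_minutes / interval_minutes)
    return runs * cap


for interval in (10, 5, 3, 1):
    print(f"interval={interval} min -> ceiling {max_events_per_bucket(interval)}")
```

Under that assumption the ceiling for a 3 min interval is about twice the one for a 5 or 10 min interval, which is close to what I see. For a 1 min interval the ceiling is above the real event volume, so the count would no longer be cut off.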
When querying another table with probably smaller events (DHCP), I don't see this kind of limit.
So... is it possible that there is some kind of data limit either in the add-on/Python or in Log Analytics (I don't have direct access to check)?
Note 1: Based on some calculations I have done, an interval shorter than the delay will not cause data loss (and almost surely no data duplication). The only side effect is that an event close to the actual date/time of an interval run may need 2 "cycles" (or more in some other scenarios) to appear.
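The calculation behind Note 1 can be sketched roughly as below. It assumes each run queries from the previous checkpoint up to (now - delay); I have not verified that the add-on works exactly this way, and the function names are only for illustration. Times are in seconds.

```python
def ceil_div(a, b):
    """Integer ceiling division (avoids float rounding)."""
    return -(-a // b)


def simulate_windows(interval, delay, runs):
    """Return the (start, end) query windows of successive runs.
    Assumption: each run covers previous checkpoint .. (now - delay)."""
    windows = []
    checkpoint = -delay  # where a run at time 0 would have stopped
    for n in range(1, runs + 1):
        end = n * interval - delay
        if end > checkpoint:
            windows.append((checkpoint, end))
            checkpoint = end
    return windows


def is_gapless(windows):
    """True if every window starts exactly where the previous one ended."""
    return all(a[1] == b[0] for a, b in zip(windows, windows[1:]))


def cycles_until_collected(event_time, interval, delay):
    """Number of runs at or after the event time until one covers it."""
    first_run_after_event = ceil_div(event_time, interval)
    collecting_run = ceil_div(event_time + delay, interval)
    return collecting_run - first_run_after_event + 1


# Interval (60 s) shorter than delay (300 s): windows still line up.
print(is_gapless(simulate_windows(interval=60, delay=300, runs=10)))
# Event 10 s before a run, interval = delay = 300 s.
print(cycles_until_collected(event_time=290, interval=300, delay=300))
```

With checkpointed windows the relation between interval and delay only changes how many runs pass before an event is picked up, not whether it is picked up.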
Update 1: With the delay at 10 min and the interval at 5 min, the number is still approx. 48k.
Update 2: Screenshot 😉
Unfortunately I don't have access to the Azure console for these events (different team), so I was not aware of this, but if it also applies to the API calls then it explains everything.
Unfortunately No. 2: if this is the case, then I have to run the input every 2 seconds to get the huge amount of events I have, or split the sources even more, which is... very difficult to manage afterwards. 😞
Both are bad (from my tests), as they put a lot of stress on the heavy forwarder overall because of the network connections.
Another thing I noticed, which actually doesn't fully confirm the above statement, is that during the night, with the amount of events from the servers in... night mode (still a lot), I see good performance, but this performance drops during working hours. I thought this might be hardware stress, network, etc., but I used a different instance with only one app and only one source and the results are identical. When there are not so many events the performance is high, and when there are a lot of events (during working hours) the performance drops on both instances!
After a lot of tests I'm 1000% sure there is a data limit somewhere. I have split the inputs into many and now get 4x more data, and at night it even goes to 6x. This doesn't mean that I have solved my issue. I'm still missing a great amount of events. I just can't tweak any further as I don't know where the issue is.
I have even tweaked queues and other things like pipelines and hardware, but... nothing.
So I would say that for Azure AD collection (as in my case) this add-on is not capable of getting all the events due to some limitations.