I just set up the inputs for AAD using the Microsoft Azure Add-on for Splunk. One of the inputs is for AAD users. I assumed it would pull all the information about AD users once and then only pull new users or changes to existing users.
However, when I enabled this input, CPU usage on the IDM skyrocketed to 99% and the add-on pulled all the user profiles again and again, so I had to disable the input. I get no errors at all in the _internal index. I only get some INFO events every second or so saying that the proxy is not enabled, which is expected since I'm not going to be using a proxy.
Is this expected behaviour? Did I set something up incorrectly? When creating the input I only filled in the mandatory options and set the interval to 30 seconds. Even when I raise the interval, it still pulls the same data again and I end up with duplicates.
I can't find anyone else having a similar issue, so I'm contemplating just setting the interval to something like once per day or longer and living with the duplicate data. It does seem strange, though, that the add-on works like that.
EDIT: In addition to what I've said already, when I run the below search I only see checkpoints for the rest of my inputs, but I don't see anything for the AAD user input.
The Azure AD user API only provides "state" data, meaning it tells you what exists. This is different from, say, Azure AD sign-in data, which tells you when something happened. Unfortunately, the Azure AD user API does not have an endpoint to surface only changes, so we have to get all the user data every time. That is also why there is no checkpoint for this input and why every run produces the same records again. For this reason, I usually set the interval to 86400 seconds (24 hours).
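To make the "state" versus "event" distinction concrete, here is a minimal Python sketch. It is not how the add-on works internally. It only illustrates that each pull is a full snapshot, and that changes would have to be derived by comparing two snapshots yourself. The record layout (dicts with an `id` field) and the function names are assumptions made for illustration.

```python
SECONDS_PER_DAY = 86400


def pulls_per_day(interval_seconds):
    """How many full snapshots an input collects per day at a given interval."""
    return SECONDS_PER_DAY // interval_seconds


def diff_snapshots(previous, current, key="id"):
    """Compare two full user snapshots and return (added, removed, changed).

    Each snapshot is a list of dicts; `key` names the field that uniquely
    identifies a user (hypothetical layout, assumed for this example).
    """
    prev_by_id = {user[key]: user for user in previous}
    curr_by_id = {user[key]: user for user in current}

    added = [u for k, u in curr_by_id.items() if k not in prev_by_id]
    removed = [u for k, u in prev_by_id.items() if k not in curr_by_id]
    changed = [
        u for k, u in curr_by_id.items()
        if k in prev_by_id and u != prev_by_id[k]
    ]
    return added, removed, changed


if __name__ == "__main__":
    # Two made-up snapshots taken one interval apart.
    yesterday = [
        {"id": "1", "displayName": "Alice", "enabled": True},
        {"id": "2", "displayName": "Bob", "enabled": True},
    ]
    today = [
        {"id": "1", "displayName": "Alice", "enabled": True},   # unchanged
        {"id": "2", "displayName": "Bob", "enabled": False},    # changed
        {"id": "3", "displayName": "Carol", "enabled": True},   # new
    ]

    added, removed, changed = diff_snapshots(yesterday, today)
    print("added:", [u["id"] for u in added])
    print("removed:", [u["id"] for u in removed])
    print("changed:", [u["id"] for u in changed])

    # A 30 second interval means thousands of full pulls per day,
    # a 24 hour interval means exactly one.
    print(pulls_per_day(30), pulls_per_day(86400))
```

Note that the unchanged user (Alice) still appears in both snapshots, which is exactly the "duplicate" data you see in the index after each run.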