Splunk Enterprise Security

HTTP 400 and 401 Error when attempting to ingest Azure AD sign-in logs

jgdixon
New Member

We have spent several weeks trying to set up a solution to ingest Azure AD sign-in logs. After finally getting what we believe to be the proper API permissions and subscription roles, we are having mixed results with ingestion.

The current API permissions are:

API / Permission name            Type

Azure Active Directory Graph (3)
  Directory.Read.All             Delegated
  Directory.Read.All             Application
  User.Read                      Delegated

Azure Service Management (1)
  user_impersonation             Delegated

Microsoft Graph (4)
  AuditLog.Read.All              Application
  SecurityActions.Read.All       Delegated
  SecurityEvents.Read.All        Delegated
  User.Read                      Delegated

All required admin consents have been granted.

Some functions of the Microsoft Azure Add-on for Splunk are working as advertised. We are getting AD users and Azure Security Center tasks, so the add-on is communicating with Azure.

The whole reason we installed this add-on is to retrieve Azure AD sign-in logs, and that is not going as well. Initially we were getting HTTPError 429 (Too Many Requests). We configured longer polling intervals and that error seems to have stopped. However, we now get a mix of HTTPError 401 Unauthorized and HTTPError 400 Bad Request, specifically for sign-ins.

Sample logs are below:
2020-01-22 07:55:18,899 ERROR pid=3367 tid=MainThread file=base_modinput.py:log_error:307 | Get error when collecting events.
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/ta_ms_aad/modinput_wrapper/base_modinput.py", line 127, in stream_events
    self.collect_events(ew)
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/MS_AAD_signins.py", line 84, in collect_events
    input_module.collect_events(self, ew)
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/input_module_MS_AAD_signins.py", line 77, in collect_events
    sign_ins = azutils.get_items(helper, access_token, url)
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/ta_azure_utils/utils.py", line 33, in get_items
    raise e
HTTPError: 400 Client Error: Bad Request for url: https://graph.microsoft.com/beta/auditLogs/signIns?$orderby=createdDateTime&$filter=createdDateTime+...

2020-01-22 06:30:01,054 ERROR pid=1518 tid=MainThread file=base_modinput.py:log_error:307 | Get error when collecting events.
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/ta_ms_aad/modinput_wrapper/base_modinput.py", line 127, in stream_events
    self.collect_events(ew)
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/MS_AAD_signins.py", line 84, in collect_events
    input_module.collect_events(self, ew)
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/input_module_MS_AAD_signins.py", line 77, in collect_events
    sign_ins = azutils.get_items(helper, access_token, url)
  File "/opt/splunk/etc/apps/TA-MS-AAD/bin/ta_azure_utils/utils.py", line 33, in get_items
    raise e
HTTPError: 401 Client Error: Unauthorized for url: https://graph.microsoft.com/beta/auditLogs/signIns?$orderby=createdDateTime&$filter=createdDateTime+...

We are looking for guidance or insight on why some aspects of the add-on work while others fail.
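
One way to narrow down a 401 like the one above is to look at what the access token actually carries. The sketch below is not part of the add-on; it assumes the token is a standard JWT and that application permissions appear in its "roles" claim, which is how Azure AD app-only tokens normally expose them. The function names decode_jwt_claims and has_app_role are made up for illustration, and the signature is deliberately not verified, so this is for debugging only.

```python
import base64
import json


def decode_jwt_claims(access_token):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    payload_b64 = access_token.split(".")[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def has_app_role(access_token, role="AuditLog.Read.All"):
    """Check whether the token's 'roles' claim lists the given application permission."""
    return role in decode_jwt_claims(access_token).get("roles", [])
```

If has_app_role(token) comes back False for the token used against graph.microsoft.com, the permission or its admin consent has probably not reached the token yet. The "aud" claim in the decoded payload is also worth a look, since a token issued for a different resource would be rejected by Microsoft Graph.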

0 Karma

gontatata
Explorer

Hello,

We are having the same issue.
It looks like aad:user has no problem, but the other inputs (sign-ins, etc.) show the same errors.
Could you let me know more details?

Best regards.

0 Karma

jgdixon
New Member

We finally had success with our setup. It appears that Azure limits the amount of data it will return in a single request. When we made the request without a query limit we got HTTP 400 errors. Basically we were asking Azure to give us everything and it did not want to. We then put a query limit on the requests but set the number too low, because I misunderstood the relationship between the query limit and the query frequency. What finally worked for me was to set the following parameters:
Interval: 180
Start Date: blank
Query Limit (optional): 45
This tells Splunk to query every three minutes and request 45 minutes of logs with each pull. I found that increasing the query limit much beyond 45 minutes made it very unstable.
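
To illustrate what a 45-minute window means at the request level, here is a rough sketch of a bounded sign-ins query. It is not the add-on's actual code: build_signins_url is a hypothetical helper, and the exact filter the add-on sends is not shown in the logs above (the URL there is truncated). The sketch only assumes that createdDateTime can be filtered with ge and le.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

SIGNINS_URL = "https://graph.microsoft.com/beta/auditLogs/signIns"


def build_signins_url(window_start, window_minutes):
    """Build a sign-ins URL limited to one time window (hypothetical helper).

    Returns the URL and the window end, which becomes the next window's start.
    """
    window_end = window_start + timedelta(minutes=window_minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    # ge/le are inclusive, so an event exactly on the boundary can appear
    # in two consecutive windows and may need de-duplication downstream.
    odata_filter = "createdDateTime ge {} and createdDateTime le {}".format(
        window_start.strftime(fmt), window_end.strftime(fmt)
    )
    url = "{}?$orderby=createdDateTime&$filter={}".format(
        SIGNINS_URL, quote(odata_filter)
    )
    return url, window_end


start = datetime(2020, 2, 8, 16, 0, 0, tzinfo=timezone.utc)
url, next_start = build_signins_url(start, 45)
```

Each poll asks only for the slice between the checkpoint and the checkpoint plus 45 minutes, which keeps a single response small enough for Azure to return it.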

This ran for about a day and a half until it had caught up to current logs. Once the logs being retrieved were current, I set the following parameters:
Interval: 300
Start Date: a date and time close to the latest retrieved log based on createdDateTime, e.g. 2020-02-08T16:00:00 from the log line "createdDateTime": "2020-02-08T16:02:29.8604116Z"
Query Limit (optional): 0
This tells Splunk to query every 5 minutes for all logs, but to go back only as far as 16:00 and move forward from there.

So far it has been mostly stable. There have been a couple of HTTP 400 or 429 errors in the last few days, but Splunk has been able to recover and re-request the logs.
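
For anyone scripting their own pulls outside the add-on, occasional 429 responses can be absorbed with a small retry loop. This is only a sketch under assumptions: fetch_with_backoff is a hypothetical helper, fetch is any zero-argument callable returning an object with status_code and headers attributes (as a requests response has), and the server may or may not send a Retry-After header in seconds.

```python
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 429 or attempts run out.

    Honors a numeric Retry-After header when present; otherwise waits
    base_delay * 2**attempt seconds. Returns the last response either way.
    """
    response = None
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the 429 back to the caller
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt
        sleep(delay)
    return response
```

Only 429 is retried here. A 400 usually means the request itself needs to change (for example a smaller time window), so repeating it unchanged is less likely to help.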

0 Karma

jgdixon
New Member

We have done something that fixed our issue. We are not positive, but we think the problem was related to an underpowered development Splunk instance. The quantity of logs being ingested was causing the server to crash. We implemented a query limit on the ingest and logs started to flow. We are now in the process of log validation.

0 Karma

jgdixon
New Member

More confusion about what we are seeing. We have been adjusting the polling interval together with the query limit to try to bring in more logs. We are currently at 7 minutes between pull attempts and a 5-minute query limit. I tried disabling the query limit to see what would happen and we immediately started getting HTTP 400 Bad Request errors from Azure. The dev environment stayed stable during the test period, but again did not receive logs.

Additionally, I am noticing that the lag between the created date/time of the logs and their ingestion into Splunk is getting worse. We are currently averaging ~4 days of lag.

Does anyone have suggestions of how to increase the ingest rate while not causing connection errors with Azure?
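
A back-of-the-envelope check may explain the growing lag. Assuming the query limit is the number of minutes of logs requested per pull, and that every pull succeeds and advances the checkpoint by exactly that window, a window shorter than the polling interval can never catch up. The helper name lag_change_per_day is made up for this illustration.

```python
def lag_change_per_day(interval_minutes, window_minutes):
    """Minutes of lag gained (+) or recovered (-) per day of wall-clock time.

    Each poll covers window_minutes of logs while interval_minutes of new
    logs accumulate, so the difference is the lag change per cycle.
    """
    cycles_per_day = 24 * 60 / interval_minutes
    return cycles_per_day * (interval_minutes - window_minutes)


# 7 minutes between pulls, 5-minute window: lag grows about 411 minutes per day.
falling_behind = lag_change_per_day(7, 5)

# 3 minutes between pulls, 45-minute window: lag shrinks 20160 minutes per day.
catching_up = lag_change_per_day(3, 45)
```

Under these assumptions the 7-minute/5-minute setting loses roughly 6.9 hours per day, while a window larger than the interval recovers backlog. Real catch-up will be slower whenever pulls fail or are throttled.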

0 Karma