I set up the Microsoft Teams Add-On For Splunk yesterday and am successfully ingesting data from our tenant. My query is regarding the relationship between the volume of incoming webhooks from Azure, and the callrecord events:
As I understand it (and this is likely the root cause 😀), Azure pushes a change notification to the Splunk webhook each time a call ends, containing the unique call ID. The Teams Call Record app/input runs on a schedule (in my case every five minutes) and retrieves all the call records it's received change notifications for since it last ran.
I would, therefore, expect there to be an equal number of m365:webhook and m365:teams:callRecord events, but there aren't. I'm typically seeing a 3:2 ratio of webhook to callRecord events.
I believe the 'id' field in the webhook event and the callRecords matches (this is the identifier splunk uses to retrieve the callRecord using graphAPI) and I would have expected the id in each event type to be unique, but there appear to be many duplicates in both event types.
If I look at my data for yesterday I can see:
4163 webhook events
3867 callRecord events
But if I dedup on 'id', I see:
2614 webhook events
2586 callRecord events
...which still doesn't match (although it's much closer) and is a lot of duplicates.
Any bright ideas, folks?
I've found an interesting specific case where there are two callRecord with the same id, both with version=1, but one is a peerToPeer call and the other is a groupCall. I think there are multiple callRecords because the initial peerToPeer call had a third participant added, escalating it to a groupCall. This could also explain some apparent duplication.
Looking at the webhook events in more detail reveals my first wrong assumption: a single call can produce multiple webhook events, with one of two changeTypes: 'created' or 'updated'. The longer the call goes on for, the more changeType:updated events are pushed to the webhook.
However, looking at callRecord events with a matching id it gets stranger. I can see 15 webhook (one 'created' and 14 'updated') events with the same id today with Splunk _time values between 10:15 and 12:15.
But there are (only) 8 matching callRecord events all with the same Splunk _time value of 07:30, startDateTime of 07:30 and endDateTime of 09:53, each with a different 'version' of 1, 2, 3, 4, 5, 8, 12 or 15, and an incrementing lastDateTimeModified value (between 10:14 and 12:12)
I thought the _time value in a splunk event showed when it was created. How can these callRecord events all have been created at 07:30, for a call that was in place between 07:30 and 09:53, and have webhook events between 10:15 and 12:15?