REST API Modular Input: How to pull data from 365 / Azure Active Directory reporting REST API?

Path Finder

I want to pull in data from the Azure reporting REST API (see https://msdn.microsoft.com/Library/Azure/Ad/Graph/howto/azure-ad-reports-and-events-preview#SignInsF...).

I have this working nicely with PowerShell, giving it the client ID, client secret, OAuth2 login URL and endpoint URL. The output is JSON (although each event is nested, so I imagine that will need unpacking in Splunk).

I'm not really sure how to translate this over into the Splunk REST input, though. I've had to add the OAuth access token manually. Is that right?

I don't seem to get any data back into the Splunk index. I guess I'm authenticating OK but not parsing the response (I've told the input to expect JSON)? Has anyone done something similar?

{
  "@odata.context":"https://graph.windows.net/api/tennantname.onmicrosoft.com/reports/$metadata#signInsFromMultipleGeographiesEvents","value":[
    {

Ultra Champion

Try a custom response handler like this:

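class AzureJSONArrayHandler:
    # Relies on json and print_xml_stream being available in responsehandlers.py.

    def __init__(self,**args):
        pass

    def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)

            # Emit each entry of the top-level "value" array as its own event.
            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            # Not JSON: pass the raw response through unchanged.
            print_xml_stream(raw_response_output)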

Add this to responsehandlers.py

Then declare this handler to be used in your setup stanza for your REST input.

Path Finder

Thanks, I think I was approaching a similar thing.

Couple of extra Qs:

  • I don't seem to get a refresh token back from MS. The access token is only valid for 1 hour. How can I work with this?
  • MS always returns the last 30 days of data, and you can't change that window (argh). Is Splunk smart enough to recognise duplicate events if I run the import every day, or will I have to de-dupe somewhere?

Ultra Champion

If the target REST API does not provide some sort of cursoring ability, then there is not much you can do.

You could perhaps keep track of the timestamp (if such a thing exists) of the latest event indexed, using some custom logic in your custom response handler, and then make sure you only send an event to Splunk if its timestamp is greater than the last timestamp.
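
The timestamp idea above could be sketched roughly as follows. This is only an illustration, not part of the app: the function name filter_new_events, the checkpoint file, and the choice of timeOfSecondSignIn as the timestamp field are all assumptions (the field name is taken from the sample output posted in this thread).

```python
import os


def filter_new_events(entries, checkpoint_path, time_field="timeOfSecondSignIn"):
    """Return entries newer than the saved checkpoint, then advance the checkpoint.

    Hypothetical helper: the name, the checkpoint file and the default
    time_field are assumptions made for this sketch.
    """
    last_seen = ""
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            last_seen = f.read().strip()

    # ISO 8601 UTC timestamps in one fixed format (e.g. 2016-02-01T02:35:55Z)
    # sort correctly as plain strings, so no date parsing is done here.
    # Entries without the field, or with a timestamp equal to the checkpoint,
    # are skipped.
    fresh = [e for e in entries if e.get(time_field, "") > last_seen]

    if fresh:
        # Remember the newest timestamp seen so the next run starts after it.
        newest = max(e[time_field] for e in fresh)
        with open(checkpoint_path, "w") as f:
            f.write(newest)
    return fresh
```

Inside a custom response handler, the loop over output['value'] would then run over filter_new_events(output['value'], some_checkpoint_path) instead. Because the comparison is strictly greater-than, two events sharing the exact same timestamp across runs could lose one of them; comparing on the id field as well would be a safer variant.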

Path Finder

Fair enough. There is a sort-of timestamp so I can look for just events in the last 24 hours.

Is there any way around the OAuth renewal problem, or is this something I need to take up with MS?

Ultra Champion

You'll have to research how Azure is implementing OAuth2, then compare that with my implementation in bin/rest.py

They may not work with each other, so you may need to customize bin/rest.py, or plug a custom authentication handler into bin/authhandlers.py
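
One possible shape for such a custom handler, as a rough sketch only: with the OAuth2 client-credentials grant (client ID plus client secret, as in the PowerShell setup described in the question) a refresh token is normally not issued, so the usual approach is simply to request a new access token before the old one expires. The names fetch_token and TokenCache are hypothetical, the code assumes Python 3, and the optional resource field is an assumption about what the Azure token endpoint expects; none of this is taken from bin/rest.py or bin/authhandlers.py.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_token(token_url, client_id, client_secret, resource=None):
    """POST a client-credentials request; return (access_token, lifetime_seconds)."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if resource:
        # Assumption: the token endpoint may expect a 'resource' form field.
        form["resource"] = resource
    body = urllib.parse.urlencode(form).encode("utf-8")
    request = urllib.request.Request(token_url, data=body)
    with urllib.request.urlopen(request) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    # expires_in may arrive as a string, so coerce it; fall back to one hour.
    return payload["access_token"], int(payload.get("expires_in", 3600))


class TokenCache:
    """Hand out a cached token, requesting a new one shortly before it expires."""

    def __init__(self, fetch, margin=60, clock=time.time):
        self._fetch = fetch      # zero-argument callable returning (token, lifetime)
        self._margin = margin    # renew this many seconds before expiry
        self._clock = clock      # injectable so the logic can be tested offline
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Fetch on first use, or when the cached token is about to expire.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, lifetime = self._fetch()
            self._expires_at = self._clock() + lifetime
        return self._token
```

A cache could be built with TokenCache(lambda: fetch_token(url, cid, secret)), and a custom authentication handler would then set the request's Authorization header to "Bearer " + cache.get() on each poll. How exactly that plugs into bin/authhandlers.py depends on the app version, so check the existing handlers there first.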

Path Finder

I think I've fixed this: I created a custom handler as below, and it now sends the events in separately.

If anyone knows how to fix the OAUTH refresh stuff I'd be most glad!

class My365JSONArrayHandler:

    def __init__(self,**args):
        pass

    def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)

            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            print_xml_stream(raw_response_output)

Communicator

Hi,

Did you ever find out how to fix the refresh problem?

Path Finder

OK, so I've made some progress by sticking in the access token I gleaned from powershell (I'll deal with the renewal token stuff later!). However the output format gives me one Splunk row for the whole import, so something like:

{
  "@odata.context":"https://graph.windows.net/api/tennantname.onmicrosoft.com/reports/$metadata#signInsFromMultipleGeographiesEvents","value":[
    {
      "firstSignInFrom":"Somewhere, GB","secondSignInFrom":"Somewhere","timeOfSecondSignIn":"2016-02-01T02:35:55Z","timeBetweenSignIns":"03:16:07","estimatedTravelHours":17,"id":"GUID","displayName":"Name","userName":"username"
    },{
      "firstSignInFrom":"Unknown Proxy","secondSignInFrom":"Somewhere","timeOfSecondSignIn":"2016-02-01T00:39:06Z","timeBetweenSignIns":"05:38:10","estimatedTravelHours":9,"id":"GUID","displayName":"Name","userName":"username"
    },{

My Python is not splendid. I've tried the built-in JSONArrayHandler, but it doesn't handle this output well, so I assume I have to do something custom.
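
For anyone wanting to try the unpacking outside Splunk first, here is a minimal standalone sketch. It assumes the response has the shape shown above (a JSON object whose "value" key holds the array of events); the function name split_odata_events is made up for illustration.

```python
import json


def split_odata_events(raw_response):
    """Return one JSON string per entry in the top-level 'value' array."""
    payload = json.loads(raw_response)
    entries = payload.get("value") if isinstance(payload, dict) else None
    if not isinstance(entries, list):
        # No 'value' array (e.g. an error body): keep the payload as one event.
        return [json.dumps(payload)]
    return [json.dumps(entry) for entry in entries]
```

Each returned string is what a custom response handler would hand to print_xml_stream, giving one Splunk event per sign-in record instead of one row for the whole import.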
