REST API Modular Input: How to pull data from the Office 365 / Azure Active Directory reporting REST API?

alexlomas
Path Finder

I want to pull in data from the Azure reporting REST API (see https://msdn.microsoft.com/Library/Azure/Ad/Graph/howto/azure-ad-reports-and-events-preview#SignInsF...).

I have this working nicely in PowerShell, giving it the client ID, client secret, OAuth2 login URL and endpoint URL. The output is JSON (although each event is nested, so I imagine that'll need unpacking in Splunk).

I'm not really sure how to translate this over into the Splunk REST input though. I've had to manually add in the OAuth access token. Is that right?

I don't seem to get any data back into the Splunk index. I guess I'm authenticating OK but the response isn't being parsed (I've told the input to expect JSON)? Has anyone done something similar?

{
  "@odata.context":"https://graph.windows.net/api/tennantname.onmicrosoft.com/reports/$metadata#signInsFromMultipleGeographiesEvents","value":[
    {

Damien_Dallimor
Ultra Champion

Try a custom response handler like this:

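# Assumes json and print_xml_stream are already available in responsehandlers.py.
class AzureJSONArrayHandler:

    def __init__(self,**args):
        pass

    def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)

            # The events are nested in the top-level "value" array;
            # emit each element as its own Splunk event.
            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            # Not JSON: pass the raw response through unchanged.
            print_xml_stream(raw_response_output)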

Add this to responsehandlers.py

Then declare this handler to be used in your setup stanza for your REST input.



alexlomas
Path Finder

Thanks, I think I was approaching a similar thing.

Couple of extra Qs:

  • I don't seem to get a refresh token back from MS. The access token is only valid for 1 hour. How can I work with this?
  • MS always returns the last 30 days of data, and you can't change that window (argh). Is Splunk smart enough to realise there are duplicate events if I run the import every day, or will I have to de-dupe somewhere?

Damien_Dallimor
Ultra Champion

If the target REST API does not provide some sort of cursoring ability, then there is not much you can do.

You could perhaps keep track of the timestamp (if such a thing exists) of the latest event indexed, using some custom logic in your custom response handler, and then only send an event to Splunk if its timestamp is greater than the last timestamp.
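
A minimal sketch of that idea, assuming the `timeOfSecondSignIn` field from the sample output posted elsewhere in this thread is the timestamp to track. The names `select_new_events` and `last_seen` are hypothetical and not part of the REST modular input, and storing `last_seen` between runs (for example in a small file) is left out:

```python
import json
from datetime import datetime

# Assumption: field name and format taken from the sample output in this thread.
TS_FIELD = "timeOfSecondSignIn"
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_ts(value):
    """Parse a UTC timestamp such as 2016-02-01T02:35:55Z into a naive datetime."""
    return datetime.strptime(value, TS_FORMAT)


def select_new_events(raw_response_output, last_seen):
    """Return (new_entries, newest_timestamp).

    last_seen is a datetime from a previous run, or None on the first run.
    Only entries strictly newer than last_seen are returned.
    """
    entries = json.loads(raw_response_output).get("value", [])
    new_entries = []
    newest = last_seen
    for entry in entries:
        ts = parse_ts(entry[TS_FIELD])
        if last_seen is None or ts > last_seen:
            new_entries.append(entry)
        if newest is None or ts > newest:
            newest = ts
    return new_entries, newest
```

Inside a custom response handler, each returned entry would then go through `print_xml_stream(json.dumps(entry))`, and `newest` would be saved for the next run. Because the comparison is strictly greater-than, an event carrying exactly the same timestamp as `last_seen` is skipped, so this may need adjusting if several events can share one timestamp.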


alexlomas
Path Finder

Fair enough. There is a sort-of timestamp, so I can look for just the events from the last 24 hours.

Is there any way around the OAuth renewal problem, or is this something I need to take up with MS?


Damien_Dallimor
Ultra Champion

You'll have to research how Azure implements OAuth2, then compare that with my implementation in bin/rest.py.

They may not work with each other, so you may need to customize bin/rest.py, or plug a custom authentication handler into bin/authhandlers.py.
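
One possible shape for such custom logic, as a sketch only. It assumes the PowerShell script uses the OAuth2 client credentials grant (client ID plus client secret, no user login); that grant normally returns no refresh token, and a new access token is simply requested again with the same credentials. The names `CachedToken` and `request_token` are hypothetical, the `resource` form field is an assumption about the Azure AD token endpoint, and wiring this into bin/authhandlers.py or bin/rest.py is not shown. It is written for Python 3; an app running on Python 2 would need the urllib/urllib2 equivalents.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def request_token(login_url, client_id, client_secret, resource):
    """POST a client credentials request and return the parsed JSON response."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": resource,  # assumption: the API the token is requested for
    }).encode("utf-8")
    with urlopen(Request(login_url, data=body)) as resp:
        return json.loads(resp.read().decode("utf-8"))


class CachedToken(object):
    """Caches an access token and fetches a new one shortly before it expires."""

    def __init__(self, fetch_token, margin_seconds=300, clock=time.time):
        # fetch_token: callable returning a dict with 'access_token' and
        # 'expires_in' (seconds), as in a standard OAuth2 token response.
        self._fetch_token = fetch_token
        self._margin = margin_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            response = self._fetch_token()
            self._token = response["access_token"]
            # int() because some servers return expires_in as a string.
            self._expires_at = now + int(response["expires_in"])
        return self._token

    def authorization_header(self):
        return {"Authorization": "Bearer " + self.get()}
```

In use, `fetch_token` would be something like `lambda: request_token(login_url, client_id, client_secret, resource)` with the same values the PowerShell script is given, so the one-hour lifetime stops mattering because a fresh token is fetched whenever the cached one is close to expiry.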


alexlomas
Path Finder

I think I've fixed this: I created a custom handler as below, and it now brings the events in separately.

If anyone knows how to fix the OAuth refresh stuff I'd be most glad!

class My365JSONArrayHandler:

    def __init__(self,**args):
        pass

    def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)

            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            print_xml_stream(raw_response_output)

hhGA
Communicator

Hi,

Did you ever find out how to fix the refresh problem?


alexlomas
Path Finder

OK, so I've made some progress by sticking in the access token I gleaned from PowerShell (I'll deal with the renewal token stuff later!). However, the output format gives me one Splunk row for the whole import, so something like:

{
  "@odata.context":"https://graph.windows.net/api/tennantname.onmicrosoft.com/reports/$metadata#signInsFromMultipleGeographiesEvents","value":[
    {
      "firstSignInFrom":"Somewhere, GB","secondSignInFrom":"Somewhere","timeOfSecondSignIn":"2016-02-01T02:35:55Z","timeBetweenSignIns":"03:16:07","estimatedTravelHours":17,"id":"GUID","displayName":"Name","userName":"username"
    },{
      "firstSignInFrom":"Unknown Proxy","secondSignInFrom":"Somewhere","timeOfSecondSignIn":"2016-02-01T00:39:06Z","timeBetweenSignIns":"05:38:10","estimatedTravelHours":9,"id":"GUID","displayName":"Name","userName":"username"
    },{

My Python is not splendid. I've tried using the built-in JSONArrayHandler, but that didn't work well here, so I'm assuming I have to do something custom.
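
For trying the unpacking outside Splunk, here is a small standalone sketch. It assumes the response always has the shape shown above (a top-level object whose `value` key holds the list of events); the function name `split_events`, the shortened sample payload and its placeholder URL and IDs are made up for illustration:

```python
import json


def split_events(raw_response_output):
    """Return one JSON string per element of the top-level 'value' array."""
    payload = json.loads(raw_response_output)
    return [json.dumps(entry, sort_keys=True) for entry in payload.get("value", [])]


# Shortened, made-up sample in the same shape as the output above.
sample = """{
  "@odata.context": "https://example.invalid/reports/$metadata#signInsFromMultipleGeographiesEvents",
  "value": [
    {"firstSignInFrom": "Somewhere, GB", "estimatedTravelHours": 17, "id": "GUID-1"},
    {"firstSignInFrom": "Unknown Proxy", "estimatedTravelHours": 9, "id": "GUID-2"}
  ]
}"""

events = split_events(sample)
for event in events:
    # In a response handler each string would be handed to Splunk as one event.
    print(event)
```

Each string in `events` is a complete JSON object, so Splunk can index every sign-in as its own event instead of one row for the whole import.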
