Getting Data In

Extract JSON device information from a long string

bbknowles
Explorer

I have okta data. One of the fields - id - contains a whole string of data which includes the browser and the app and the device. The problem is that the device is not consistently in the same location. If the user is trying to access Calendar, it might list the mobile device or the operating system (for Mac or iPhone) at the beginning of the string. Androids appear to list in parens with the version in the middle of the string.

Here are some examples:

Mac+OS+X/10.14 (18A391) CalendarAgent/416

Mozilla/5.0 (Linux; Android 8.1.0; SM-T580) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36

Mozilla/5.0 (iPhone; CPU iPhone OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1 Mobile/15E148 Safari/604.1

I'm using spath for other fields but they come in pairs. I have no idea how to parse this since the devices aren't located in a set place in the string.

Here's a look at the actual json string:

{   [-] 
     action:    {   [-] 
         categories:    [   [-] 
         Sign-in Failure    
         Suspicious Activity    
        ]   
         message:    Sign-in Failed - User is currently locked out  
         objectType:     core.user_auth.login_failed    
         requestUri:     /api/v1/authn  
    }   
     actors:    [   [-] 
        {   [-] 
         displayName:    SAFARI 
         id:     Mozilla/5.0 (iPhone; CPU iPhone OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1 Mobile/15E148 Safari/604.1    
         ipAddress:  XXX.XX.XXX.XXX 
         objectType:     Client 
        }   
    ]   
     eventId:    tevoAE1o350RMaoTKZTQFuBhQ1555012915000 
     published:  2019-04-11T20:01:55.000Z   
     requestId:  XK@dMhHceXrbWBajIF8MnQAABPI    
     sessionId: 
     targets:   [   [-] 
        {   [+] 
        }   
    ]   
}

Any advice?

0 Karma

woodcock
Esteemed Legend

Next time post the raw text. This is display-formatted json and several layers are collapsed.

0 Karma

somesoni2
Revered Legend

Parsing User Agent information is very difficult and most regular expression method are not 100% accurate. You can give this a try

https://regex101.com/r/e7kICk/1

0 Karma

bbknowles
Explorer

Wow. Thanks! This code is almost perfect. The one entry that didn't match doesn't seem to have a recognizable device.

I've never even used json before. My boss assigned the project yesterday and said it was my top priority.

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...