Hi
I am trying to extract user agent details with rex, but I am finding it difficult because of my limited Splunk knowledge. With the rex commands below I managed to extract the OS and version.
These rex commands work fine, but I don't know how to capture the additional details shown in the table further down.
1 - \((?P<os>[^;]+);(?P<vers>[^;)]+).*$
2 - | rex "\(.*(?<OS>Android\s\d+|OS \d+_\d+|Windows NT\s\d+\.\d+)\;?.*\)"
| fillnull value="unrecognised" OS
3 - rex "\((?P<osinfo>[^\)]+)\)" | rex field=osinfo "(?P<os>[^;]+);(?P<vers>[^;]+)(;(?P<etc>[^;]+))?" | stats count by os, vers
I would like to extract the fields in the format below. Would that be possible?
Mobile Device | Software name | Software version | Layout Engine | OS System | OS | OS version |
A10 - SM-A105G | Chrome | 86.0.42.40.185 | Blink | Android 10 | Android | 10 |
iPhone | Safari | 14 | WebKit | iOS 14.1 | iOS | 14.1 |
Desktop | Chrome 86 | 86.0.4240.111 | Blink | Windows 10 | Windows | 10 |
The user agent has a different format for iOS, Android, and desktop, as shown below:
Android user - Mozilla/5.0 (Linux; Android 10; SAMSUNG SMT590) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser / 12.1 Chrome/79.0.3945.136 Safari/537.36
iPhone user - Mozilla/5.0 (iPhone; CPU iPhone OS 14_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Mobile/15E148 Safari/604.1
Desktop user - Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36
HP device
Mozilla/5.0 (Linux; Android 5.1.1; HP Pro Slate 12 Build/LMY47V; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/68.0.3440.91 Safari/537.36
Could anyone please help me write a regular expression that fills in the table columns above?
Thanks
Hi @jaibalaraman,
You have to use one regex for each kind of log, plus some transformations (replace and eval) to display the results in the format you want when it differs from the logs (e.g. iOS 14.1 appears in the logs as OS 14_1).
In addition, the "Layout Engine" field isn't present in the logs.
So try these regexes:
iPhone
| rex "\((?<mobile_device>\w+);\s+\w+\s+\w+\s+(?<os>\w+)\s+(?<os_version>\w+).*Version\/(?<software_version>[^ ]+)\s+\w+\/\w+\s+(?<software_name>\w+)\/\d+\.\d+$"
| replace "OS" with "iOS" in os
| replace "*_*" with "*.*" in os_version
| eval os_system=os." ".os_version
You can test the regex at https://regex101.com/r/KCegdc/2
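To check the iPhone pattern and the clean-up steps outside Splunk, here is a minimal Python sketch. It assumes the same regex as above, rewritten with `(?P<name>...)` because Python's `re` module does not accept the `(?<name>...)` form. The names `IPHONE_RE` and `parse_iphone` are hypothetical, and the string operations only approximate what `replace` and `eval` do in the search.

```python
import re

# Same pattern as the rex above, using Python's (?P<name>...) group syntax.
IPHONE_RE = re.compile(
    r"\((?P<mobile_device>\w+);\s+\w+\s+\w+\s+(?P<os>\w+)\s+(?P<os_version>\w+)"
    r".*Version/(?P<software_version>[^ ]+)\s+\w+/\w+\s+(?P<software_name>\w+)/\d+\.\d+$"
)


def parse_iphone(user_agent):
    """Return the extracted fields as a dict, or None when the pattern does not match."""
    match = IPHONE_RE.search(user_agent)
    if match is None:
        return None
    fields = match.groupdict()
    # Approximates the replace/eval steps: OS -> iOS, 14_1 -> 14.1.
    if fields["os"] == "OS":
        fields["os"] = "iOS"
    fields["os_version"] = fields["os_version"].replace("_", ".")
    fields["os_system"] = fields["os"] + " " + fields["os_version"]
    return fields


sample = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 14_1 like Mac OS X) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/14.0 Mobile/15E148 Safari/604.1"
)
print(parse_iphone(sample))
```

A user agent that does not have the iPhone layout returns None, which corresponds to the fields being empty in Splunk.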
Android
This regex also does not extract the device information.
| rex "\(\w+;\s+(?<os>\w+)\s+(?<os_version>\w+);.*SamsungBrowser\s+\/\s+\d+\.\d+\s+(?<software_name>[^\/]+)\/(?<software_version>[^ ]+)"
| eval os_system=os." ".os_version
You can test the regex at https://regex101.com/r/poQV2h/1
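Note that the Android pattern above requires a SamsungBrowser token and an OS version without dots, so it would not match the HP user agent from the question (Android 5.1.1, no SamsungBrowser). Below is a hedged sketch of a more tolerant variant, written in Python so it can be tested outside Splunk. `ANDROID_RE` and `parse_android` are hypothetical names, and the pattern is an assumption based only on the two Android samples in this thread. It also assumes the browser of interest is the `Chrome/` token.

```python
import re

# Assumed layout: (Linux; Android <version>; <device>[ Build/<id>][; ...]) ... Chrome/<version>
ANDROID_RE = re.compile(
    r"\(Linux;\s+(?P<os>Android)\s+(?P<os_version>[\d.]+);\s+"
    r"(?P<mobile_device>[^;)]+?)(?:\s+Build/[^;)]+)?\s*[;)]"
    r".*?(?P<software_name>Chrome)/(?P<software_version>[\d.]+)"
)


def parse_android(user_agent):
    """Return the extracted fields as a dict, or None when the pattern does not match."""
    match = ANDROID_RE.search(user_agent)
    if match is None:
        return None
    fields = match.groupdict()
    # Equivalent of: | eval os_system=os." ".os_version
    fields["os_system"] = fields["os"] + " " + fields["os_version"]
    return fields


samsung = (
    "Mozilla/5.0 (Linux; Android 10; SAMSUNG SMT590) AppleWebKit/537.36 "
    "(KHTML, like Gecko) SamsungBrowser / 12.1 Chrome/79.0.3945.136 Safari/537.36"
)
hp = (
    "Mozilla/5.0 (Linux; Android 5.1.1; HP Pro Slate 12 Build/LMY47V; wv) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/68.0.3440.91 Safari/537.36"
)
for ua in (samsung, hp):
    print(parse_android(ua))
```

If the variant behaves as expected on your data, the same pattern should work in rex, since Splunk accepts the `(?P<name>...)` form as well.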
Desktop
| rex "\((?<os>\w+)\s+\w+\s+(?<os_version>[^;]+)[^\)]+\)[^\)]+\)\s+(?<software_name>[^\/]+)\/(?<software_version>[^ ]+)"
| eval os_system=os." ".os_version
You can test the regex at https://regex101.com/r/chALlI/1
Ciao.
Giuseppe
Hi
I tried it, but it's not working. Could you please help me fix this issue?
Thanks
Hi @jaibalaraman,
Could you describe "not working" in more detail?
No results? Wrong results? An error message? What exactly happens?
Did you try one regex or two?
Ciao.
Giuseppe
You asked a very similar question again last week here
You may need to escape the hyphens and the slashes. Try your rex at regex101.com: you can paste in all the user agent lines and see how well it works against all of them. You may also want to break the string into parts and apply further rex commands to just those parts, e.g.
| rex "(?<firstpart>[^\(]+)\((?<secondpart>[^\)]+)\)(?<thirdpart>[^\(]+)\((?<fourthpart>[^\)]+)\)(?<fifthpart>.*)"
| rex field=secondpart "(?<OS>Android|Windows|OS)"
| rex field=fifthpart "(?<browser>Safari|Chrome)"
etc. Note that not all user agent strings follow this pattern, so some may still fall through, but you can find those and extend your rex to cover them all eventually (until a manufacturer brings out a new phone or OS that you hadn't accounted for!). This is an ongoing activity, and you might want to question the value you are getting from knowing this information!
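The split-then-match idea can be tried outside Splunk with a short Python sketch like the one below. It assumes the same three patterns as the rex commands above, rewritten with `(?P<name>...)` for Python's `re` module. The function name `classify` is hypothetical, and the "unrecognised" fallback borrows the fillnull value from the original question.

```python
import re

# Stage 1: split the user agent around its two parenthesised groups.
PARTS_RE = re.compile(
    r"(?P<firstpart>[^(]+)\((?P<secondpart>[^)]+)\)"
    r"(?P<thirdpart>[^(]+)\((?P<fourthpart>[^)]+)\)(?P<fifthpart>.*)"
)
# Stage 2: small patterns applied to one part each.
OS_RE = re.compile(r"(?P<OS>Android|Windows|OS)")
BROWSER_RE = re.compile(r"(?P<browser>Safari|Chrome)")


def classify(user_agent):
    """Return OS and browser, falling back to 'unrecognised' when a stage fails."""
    parts = PARTS_RE.search(user_agent)
    if parts is None:
        return {"OS": "unrecognised", "browser": "unrecognised"}
    os_match = OS_RE.search(parts.group("secondpart"))
    browser_match = BROWSER_RE.search(parts.group("fifthpart"))
    return {
        "OS": os_match.group("OS") if os_match else "unrecognised",
        "browser": browser_match.group("browser") if browser_match else "unrecognised",
    }


samples = [
    "Mozilla/5.0 (Linux; Android 10; SAMSUNG SMT590) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser / 12.1 Chrome/79.0.3945.136 Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 14_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36",
    "Mozilla/5.0 (Linux; Android 5.1.1; HP Pro Slate 12 Build/LMY47V; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/68.0.3440.91 Safari/537.36",
]
for ua in samples:
    print(classify(ua))
```

Counting how many strings come back as "unrecognised" is one way to find the user agents that fall through and need another pattern.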