Splunk Search

Extracting devices that accessed my website

brownd92
New Member

Hi there,
How do I write a report that parses a log file and tells me which devices have accessed my website?
Example line from source file:

9/17/2012 8:45:18 AM 12.23.34.45 Mozilla/5.0 (iPhone; CPU iPhone OS 5_1_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko)

I need a report which will say:
iPhone 24%
Blackberry 2%
Windows 15%

I would like to define the devices the same way I would type them in the search field, for example:
source="/Users/me/extendedlog.txt" iphone

Thanks in advance

0 Karma

lguinn2
Legend

There is an app that provides a dynamic lookup for user agent strings; it is called TA-uas_parser. Download it from

http://apps.splunk.com/app/1007

It's free. It should help you parse out the devices.

0 Karma

kristian_kolb
Ultra Champion

Ok, so first you need to extract the fields; you can try this in the search field as a rex statement before committing it to config files.

 ... | rex "^(?:[\S]* ){4}(?<ua>.*)\s\w+$" 

That should give you the various user-agents in a field called ua. Note that the trailing \s\w+$ expects a single plain word (such as SIMAPP) after the user-agent, so events that do not end in a plain word will not match. Then comes the tricky part - trying to match a particular (set of) user-agent(s) to a 'device'. The example below is one way to do this; there may be other, simpler ways, but the nature of user-agents is that they can look like almost anything. You'll have to fill in strings that match your needs, as this example only matches 'MSIE 7.0', 'MSIE 8.0' and 'Safari'.

... | eval device = case(ua LIKE "%MSIE 7.0%", "IE7", ua LIKE "%MSIE 8.0%","IE8", ua LIKE "%Safari%","Apple") 

Then you can do stuff like:

 ... | top 10 device

or

... | stats c by device
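Putting the pieces above together for the report in the original question, a rough sketch could look like the search below. The match strings and the "Other" bucket are assumptions for illustration, not a complete device list, and the rex here simply captures everything after the fourth field (so a trailing word such as SIMAPP stays inside ua, which does not affect the substring matches). The lower() call is there because like() is case-sensitive.

```
source="/Users/me/extendedlog.txt"
| rex "^(?:\S+ ){4}(?<ua>.+)$"
| eval device=case(like(lower(ua), "%iphone%"), "iPhone", like(lower(ua), "%blackberry%"), "Blackberry", like(lower(ua), "%windows%"), "Windows", true(), "Other")
| top limit=10 device
```

top returns a count and a percent column per device, which is close to the "iPhone 24%" layout asked for. The order of the case() branches matters, since the first matching branch wins.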

Hope this helps,

Kristian

0 Karma

brownd92
New Member

Thanks, I'll try that and let you know 🙂

0 Karma

kristian_kolb
Ultra Champion

Edit: fixed a typo and added some extra info.

0 Karma

brownd92
New Member

Hi there,
SIMAPP could be another word, but it is always a single word, not a string with spaces.

Thanks

0 Karma

kristian_kolb
Ultra Champion

So it's just the timestamp, IP, user-agent, and a trailing string?

And in these cases you want to label this as IE7?

Unfortunately for you, the log seems to be whitespace separated, and the user_agent contains whitespace...

What does the string SIMAPP stand for? Is it always SIMAPP or could it be anything (including strings with spaces)?
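If the trailing token turns out to be a single word that is always preceded by "; " (as in the two sample events), one possible way around the whitespace problem is to anchor on the fixed leading fields and make the trailing word optional. This is only a sketch under that assumption; the field names log_date, log_time, clientip and app are made up for illustration.

```
... | rex "^(?<log_date>\S+) (?<log_time>\S+ [AP]M) (?<clientip>\S+) (?<ua>.*?)(?:; (?<app>\w+))?$"
```

With this pattern, events that end in "; SIMAPP" should get app=SIMAPP and a ua without it, while events with no trailing word should keep the whole remainder in ua and leave app empty.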

/k

0 Karma

brownd92
New Member

9/5/2012 12:43:22 PM 84.241.141.114 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET4.0C; .NET4.0E; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729); SIMAPP
9/5/2012 12:45:12 PM 84.241.141.114 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET4.0C; .NET4.0E; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729); SIMAPP

0 Karma

kristian_kolb
Ultra Champion

The problem will be to determine how you want to parse the User_Agent into a 'device' - i.e. something that would make sense.

Given that user-agents differ wildly, there is no definitive way to do this.

However, your logs may be 'nicer' and more predictable than the average internet-facing web server. Please provide some more sample events.

/k

0 Karma