I'm trying to use lookups to do a keyword search and I can't wrap my head around the right way to do this.
I've got some web logs I'm looking at in Splunk that contain data identifying what operating system and browser a user is using. The string that contains this data doesn't always follow the same format, so my regexes haven't been successful. I'm planning on making a chart of the most popular browsers and the most popular operating systems. I'd like to try the following as a new idea:
I have a simple search like this. I am looking at one particular object to get the information I need:
sourcetype=access_logs command=GET company_logo | dedup username
The type of information I get back in the results is:
10.10.10.10 10.120.130.140 www.testing.somedomain.com [22/Jul/2013:19:22:08 +0000] 304 "GET /blahblah-tmf/images/company_logo.png HTTP/1.1" [booberry] (http-apr-8080-exec-3) 1 - "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0"
So, I want to pipe this search to the lookup file, look for the keywords I have listed, map those keywords to friendlier names in a new field, and then count how many times each of those names was found. If I could do this with an automatic lookup instead of the lookup command, that would be cool too.
My lookup file for the browser csv I started looked like:
keyword,browser_type
Trident/4.0,IE8
Trident/5.0,IE9
Trident/6.0,IE10
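One way this kind of keyword lookup could be set up is as a wildcard lookup. This is only a sketch under assumptions: the stanza name browser_lookup and the file name browser_types.csv are made up for illustration, and the search assumes the user-agent string has already been extracted into a field called http_user_agent. The keywords in the CSV would also need wildcards around them (for example *Trident/4.0*,IE8) so they can match anywhere inside the string.

```
# transforms.conf (stanza and file names are hypothetical)
[browser_lookup]
filename = browser_types.csv
# treat values in the keyword column as wildcard patterns
match_type = WILDCARD(keyword)
```

```
sourcetype=access_logs command=GET company_logo
| dedup username
| lookup browser_lookup keyword AS http_user_agent OUTPUT browser_type
| stats count by browser_type
```

Events whose user agent matches none of the keywords would come back with no browser_type, so they would drop out of the count.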
I checked a few other questions on this but didn't get it right just yet so figured I'd dump that here. I tried this one: http://splunk-base.splunk.com/answers/84799/find-multiple-keywords-in-file-and-show-them-on-a-chart
My search is this so far:
sourcetype=access_logs command=GET company_logo | dedup username
Any ideas?
Before you go too far down this path, you might look at this question/answer about IIS User Agent Extraction.
There is no definitive list of possible user agents, and no algorithm for deriving the OS and browser from the user agent. But the technology add-ons mentioned in the IIS User Agent Extraction question are pretty good.
OK, I actually got this to work by extracting the entire string "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0" into a field called "http_user_access" and downloading the necessary CSV file for the app (see the README). This worked. 🙂 Thanks for the tips!
Well the field extractor is not letting me extract that information into a field so I guess I have to do this manually.
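For a manual extraction, an inline rex is one option. This is a rough sketch that assumes the user agent is always the last double-quoted string in the event, as in the sample event from the question. The field name http_user_agent is chosen only because that is what the plugin is said to expect.

```
sourcetype=access_logs command=GET company_logo
| rex "\"(?<http_user_agent>[^\"]+)\"\s*$"
| dedup username
| stats count by http_user_agent
```

If some events end with something other than the quoted user agent, the pattern would not match and those events would simply have no http_user_agent field.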
Yes, I think that will work...
Oh wait, are you saying I should make a field out of the entire string "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0" that I might find in the logs and then pass that to the lookup file?
Well, the search I was using originally was sourcetype=access_logs command=GET company_logo | dedup username. I was trying to get only one count of the browser and OS per user when they log in to the web app. I'm going to keep playing around with this a bit though.
And you could also send a message to the author of the plug-in. I am sure he would answer...
If you were using the sourcetype of access_combined or access_combined_wcookie (which are built into Splunk), you would have a field named useragent. You could set a field alias of http_user_agent and that would solve the problem.
For your sourcetype, I don't know what field you have, but it should include the entire string
"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0"
Again, setting a field alias would create the field name that the script expects. That would be easier than changing the script.
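As a sketch, a field alias in props.conf could look like the following. The class name after FIELDALIAS- is arbitrary, and the second stanza is purely hypothetical: it assumes your access_logs sourcetype already has a field (called my_agent_field here as a placeholder) that holds the full user-agent string.

```
# props.conf
[access_combined]
# built-in sourcetype: alias the existing useragent field
FIELDALIAS-ua = useragent AS http_user_agent

[access_logs]
# hypothetical: replace my_agent_field with your actual field name
FIELDALIAS-ua = my_agent_field AS http_user_agent
```

The alias adds the new field name alongside the original one, so existing searches that use the original field should keep working.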
Well, the one plugin expects a field named http_user_agent, which I don't have. I tried changing the script to look at a different field, but so far no dice. It's a cool plugin though.