Splunk Search

User keyword Lookup and Replace

gnovak
Builder

I'm trying to use lookups to do a keyword search and I can't wrap my head around the right way to do this.

I've got some web logs I'm looking at in Splunk that identify which operating system and browser a user is using. The string that contains this data doesn't always follow the same format, so my regexes haven't been successful. I'm planning to make a chart of the most popular browsers and the most popular operating systems. As a new approach, I'd like to do the following:

  1. Make a csv of all the operating systems and a csv of all the browsers.
  2. Use the lookup command to do a keyword search that locates these keywords and renames them to more identifiable terms (example: Windows NT 6.1 = Windows 7).
  3. Perform a count of how many times the new identifiable term (example: Windows 7) has been found for the given period of time.

I have a simple search like this. I am looking at one particular object to get the information I need:

sourcetype=access_logs command=GET company_logo | dedup username

The type of information I get back in the results is:

10.10.10.10 10.120.130.140 www.testing.somedomain.com [22/Jul/2013:19:22:08 +0000] 304 "GET /blahblah-tmf/images/company_logo.png HTTP/1.1" [booberry] (http-apr-8080-exec-3) 1 - "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0"

So, I want to pipe this search into the lookup file, find the keywords I have listed, map each keyword to a friendlier name in a new field, and then count how many times each renamed keyword was found. Even if I don't use the lookup command, an automatic lookup that does the same thing would be cool.
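
The pipeline described here could be sketched directly in SPL, without a lookup file, using eval with case() and like(). This is only a rough illustration under stated assumptions: it matches against _raw rather than an extracted field, it reuses the Trident and Windows NT 6.1 mappings from this post, and the Firefox and "Other" branches are additions made up for the example.

```spl
sourcetype=access_logs command=GET company_logo
| dedup username
| eval browser_type=case(
    like(_raw, "%Trident/4.0%"), "IE8",
    like(_raw, "%Trident/5.0%"), "IE9",
    like(_raw, "%Trident/6.0%"), "IE10",
    like(_raw, "%Firefox/%"), "Firefox",
    true(), "Other")
| eval os_type=case(
    like(_raw, "%Windows NT 6.1%"), "Windows 7",
    true(), "Other")
| stats count by os_type browser_type
```

Every new keyword means editing the search itself, which is why a lookup file is usually the more maintainable choice once the list grows.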

My lookup file for the browser csv I started looked like:

keyword,browser_type
Trident/4.0,IE8
Trident/5.0,IE9
Trident/6.0,IE10
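
With a lookup file like this, one possible approach is a wildcard lookup. The sketch below rests on several assumptions that are not in the original post: a lookup definition named browser_keywords (a hypothetical name) that points at this CSV with its match type set to WILDCARD(keyword), keyword values wrapped in asterisks (for example *Trident/4.0*) so they can match anywhere in the string, and an already extracted field named http_user_agent that holds the full user-agent string.

```spl
sourcetype=access_logs command=GET company_logo
| dedup username
| lookup browser_keywords keyword AS http_user_agent OUTPUT browser_type
| fillnull value="Unknown" browser_type
| stats count by browser_type
| sort - count
```

Events whose user agent matches none of the keywords end up in the "Unknown" bucket, which is a quick way to spot keywords still missing from the CSV.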

I checked a few other questions on this but didn't get it right just yet so figured I'd dump that here. I tried this one: http://splunk-base.splunk.com/answers/84799/find-multiple-keywords-in-file-and-show-them-on-a-chart

My search is this so far:
sourcetype=access_logs command=GET company_logo | dedup username

Any ideas?

1 Solution

lguinn2
Legend

Before you go too far down this path, you might look at this question/answer about IIS User Agent Extraction.

There is no definitive list of possible user agents, and no algorithm for deriving the OS and browser from the user agent. But the technology add-ons mentioned in the IIS User Agent Extraction question are pretty good.

gnovak
Builder

OK, I actually got this to work by extracting the entire string "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0" into a field called "http_user_access" and downloading the necessary CSV file for the app (see the README). This worked. 🙂 Thanks for the tips!

gnovak
Builder

Well the field extractor is not letting me extract that information into a field so I guess I have to do this manually.
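
When the field extractor won't cooperate, a manual extraction can be done inline with rex. The following is a minimal sketch, assuming the user agent is always the last double-quoted string in the event, as in the sample log line in the question; the field name http_user_agent is chosen to match what the plug-in reportedly expects, and the pattern has not been checked against other log formats.

```spl
sourcetype=access_logs command=GET company_logo
| rex "\"(?<http_user_agent>[^\"]+)\"\s*$"
| dedup username
| table username http_user_agent
```

Once the pattern proves reliable, the same regex could be saved as a persistent field extraction for the sourcetype so that it does not have to be repeated in every search.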

lguinn2
Legend

Yes, I think that will work...

gnovak
Builder

Oh wait, are you saying I should make a field out of the entire string "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0" that I might find in the logs and then pass that to the lookup file?

gnovak
Builder

Well, the search I was using originally was sourcetype=access_logs command=GET company_logo | dedup username. I was trying to count each user's browser and OS only once when they log in to the web app. I'm going to keep playing around with this a bit though.

lguinn2
Legend

And you could also send a message to the author of the plug-in. I am sure he would answer...

lguinn2
Legend

If you were using the sourcetype of access_combined or access_combined_wcookie (which are built into Splunk), you would have a field named useragent. You could set a field alias of http_user_agent and that would solve the problem.

For your sourcetype, I don't know what field you have, but it should include the entire string
"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0"

Again, setting a field alias would create the field name that the script expects. That would be easier than changing the script.

gnovak
Builder

Well, the one plugin expects a field named http_user_agent, which I don't have. I tried changing the script to look at a different field, but so far no dice. It's a cool plugin though.
