Splunk Search

How to generate a field extraction from my logs?

splunker9999
Path Finder

Hi,

We need to extract fields from the log events below. We tried, but ran into trouble because some of the events have a different layout from the others.

All of these events come from a single access_log file. We tried using CIM, but it didn't work for us.

^(?:[^\]\n]*\])\s+(?P<host_apache>[^\s]+)\s+(?P<clientip>[^\s]+)\s+(?P<remoteaddr>[^\s]+)\s+(?P<forwardedfor>(\-|\d+\.\d+\.\d+\.\d+\,?\s?)+)\s(?P<trueip>[^\"]+)\"(?P<request>[^\"]+)\"\s(?P<status_new>\d+)\s(?P<bytes>[^\s]+)\s(?P<time_taken>\d+)\s\"(?P<referer>[^\"]+)\"\s\"(?P<cache_control>[^\"]+)\"\s\"(?P<user_agent>[^\"]+)\"

Below are the events. We need to extract the trueip, method, URI, and status fields from them. Can you please help us?
The fields we need are marked with emphasis (***...***) in each event.

[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.36 196.49.49.36 ***10.19.48.71***, 57.28.75.174 57.28.75.196 ***GET*** ***/api-dual/us*** HTTP/1.1  ***200*** 2013 0.764 https://secure.chas.com/ no-cache, max-age=0 Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36  Chrome/55.0.2883.8537 Safari/7.310.4031EB0C1D8BE068A9BD600440A5D792 jcXr-RtOQKZS804wxfPLCFZ3 - Web

----------------------------
[2/Jun/2009:07:36:19 -0600] api.co.us 196.49.49.36 ***196.49.49.36*** 10.19.48.71 - ***POST*** ***/api-dual/menu/Identity*** HTTP/1.1  *200* 47 0.659 - no-cache Apache CXF 3.1.0 - - - -

-------------------------------------------------------------------------
[2/Jun/2009:07:36:19 -0600]  196.49.49.47 ***196.49.49.47*** - ***GET*** ***/**api-login/*** HTTP/1.1 ***200*** 163 0.002 - - - - - - -
------------------------------------------------------------------------------
[[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.47 ***196.49.49.47***  10.19.48.71, 57.28.75.174 57.28.75.196 ***GET*** *****/api-dual/accounts/?status=Posted&toDate=2009-06-02&fromDate=2009-06-24&channelType=Mobile&action=next*** HTTP/1.1 ?status=Posted&toDate=2009-06-24&fromDate=2016-09-24&channelType=Mobile&action=next ***200*** 13898 0.194 https://secure.com max-age=0 Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13G36 F72AEE043E08F5020E2F5A84F58774EE zKplRZORI+gYDxth6rxIj8rZ  -chs26

Thanks


gokadroid
Motivator

Since the trueip seems to shift position within the event, it may not be easy to extract a single IP unless it follows some pattern (either a fixed position in the string or something that marks where trueip begins or ends). For the other fields, you can try the approach below to extract the method, URI, and status from the sample data:

your query to return the events
| rex field=_raw "(?<method>(GET|POST))\s*(?<uri>[\S]+)\s*(?<protocol>(HTTP|\w+)\/[\S]+)\s*(.*?)(?<status>[\d]+)\s\d+\s*\d+\.\d+"
| table method, uri, status

If the method can be something other than GET or POST, add it to the capturing group, for example (GET|POST|MYACTION).
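As a rough way to sanity-check the pattern outside Splunk, the sketch below runs the same regex over two of the sample events using Python's re module. This is only an approximation of what rex does: Python needs (?P<name>...) where rex accepts (?<name>...), and the names PATTERN, EVENTS, and extract are made up for this illustration. The events are copied from the question with the *** markers removed and the trailing user-agent fields trimmed.

```python
import re

# Same pattern as the rex above; only the named-group syntax is changed
# from (?<name>...) to (?P<name>...) because Python's re requires it.
PATTERN = re.compile(
    r"(?P<method>(GET|POST))\s*(?P<uri>[\S]+)\s*(?P<protocol>(HTTP|\w+)\/[\S]+)\s*"
    r"(.*?)(?P<status>[\d]+)\s\d+\s*\d+\.\d+"
)

# Sample events from the question (emphasis markers removed, tails trimmed).
EVENTS = [
    "[2/Jun/2009:07:36:19 -0600] api.co.us 196.49.49.36 196.49.49.36 "
    "10.19.48.71 - POST /api-dual/menu/Identity HTTP/1.1  200 47 0.659 "
    "- no-cache Apache CXF 3.1.0 - - - -",
    "[[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.47 196.49.49.47  "
    "10.19.48.71, 57.28.75.174 57.28.75.196 GET "
    "/api-dual/accounts/?status=Posted&toDate=2009-06-02&fromDate=2009-06-24"
    "&channelType=Mobile&action=next HTTP/1.1 "
    "?status=Posted&toDate=2009-06-24&fromDate=2016-09-24"
    "&channelType=Mobile&action=next 200 13898 0.194 "
    "https://secure.com max-age=0",
]


def extract(event):
    """Return method, uri and status for one event, or None if no match."""
    match = PATTERN.search(event)
    if match is None:
        return None
    return {name: match.group(name) for name in ("method", "uri", "status")}


for event in EVENTS:
    print(extract(event))
```

In the second event the lazy (.*?) skips over the repeated query string after HTTP/1.1, because none of the digit runs inside it are followed by whitespace, a byte count, and a decimal time.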

splunker9999
Path Finder

Hi, thanks. This works. We need help with one more thing.

In a few events, the status field is followed by a space and " -", as in the event below. Can you please help us include this condition in the regex?

 [[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.47 ***196.49.49.47***  10.19.48.71, 57.28.75.174 57.28.75.196 ***GET*** *****/api-dual/accounts/?status=Posted&toDate=2009-06-02&fromDate=2009-06-24&channelType=Mobile&action=next*** HTTP/1.1 ?status=Posted&toDate=2009-06-24&fromDate=2016-09-24&channelType=Mobile&action=next ***200*** - 13898 0.194 https://secure.com max-age=0 Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13G36 F72AEE043E08F5020E2F5A84F58774EE zKplRZORI+gYDxth6rxIj8rZ  -chs26

Thanks


gokadroid
Motivator

Try this please:

your query to return the events
| rex field=_raw "(?<method>(GET|POST))\s*(?<uri>[\S]+)\s*(?<protocol>(HTTP|\w+)\/[\S]+)\s*(.*?)(?<status>[\d]+)(\s|\s-\s)\d+\s*\d+\.\d+"
| table method, uri, status

I replaced the \s after the status group with (\s|\s-\s), so the status can be followed by either a single space or " - ".
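To check that the updated pattern handles both layouts, here is a small sketch using Python's re module rather than Splunk itself. The named groups are rewritten as (?P<name>...) for Python, and STATUS_PATTERN, status_of, and the two test strings are illustrative names. The strings are shortened copies of the sample events, starting at the request method.

```python
import re

# The updated rex pattern, with (?<name>...) changed to (?P<name>...)
# because Python's re module requires that form.
STATUS_PATTERN = re.compile(
    r"(?P<method>(GET|POST))\s*(?P<uri>[\S]+)\s*(?P<protocol>(HTTP|\w+)\/[\S]+)\s*"
    r"(.*?)(?P<status>[\d]+)(\s|\s-\s)\d+\s*\d+\.\d+"
)


def status_of(event):
    """Return the captured status as a string, or None if nothing matches."""
    match = STATUS_PATTERN.search(event)
    return match.group("status") if match else None


# Layout with " - " between the status and the byte count.
with_dash = (
    "GET /api-dual/accounts/?status=Posted&toDate=2009-06-02"
    "&fromDate=2009-06-24&channelType=Mobile&action=next HTTP/1.1 "
    "?status=Posted&toDate=2009-06-24&fromDate=2016-09-24"
    "&channelType=Mobile&action=next 200 - 13898 0.194 https://secure.com"
)

# Original layout with only a space after the status.
without_dash = "POST /api-dual/menu/Identity HTTP/1.1  200 47 0.659 - no-cache"

print(status_of(with_dash))
print(status_of(without_dash))
```

The alternation (\s|\s-\s) could probably also be written as \s(?:-\s)?, which matches the same two cases without adding a capture group.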
