Splunk Search

How to generate a field extraction from my logs?

Path Finder

Hi,

We need to extract fields from the log events below. We tried, but ran into trouble because some of the events are formatted differently from the others.

All of these events come from a single access_log file. We tried using CIM, but that didn't work for us.

^(?:[^\]\n]*\])\s+(?P<host_apache>[^\s]+)\s+(?P<clientip>[^\s]+)\s+(?P<remoteaddr>[^\s]+)\s+(?P<forwardedfor>(\-|\d+\.\d+\.\d+\.\d+\,?\s?)+)\s(?P<trueip>[^\"]+)\"(?P<request>[^\"]+)\"\s(?P<status_new>\d+)\s(?P<bytes>[^\s]+)\s(?P<time_taken>\d+)\s\"(?P<referer>[^\"]+)\"\s\"(?P<cache_control>[^\"]+)\"\s\"(?P<user_agent>[^\"]+)\"

Below are the events. We need to extract the trueip, Method, URI, and status fields from them. Can you please help us?
The highlighted and emphasized values (marked with asterisks) are the fields we need.

[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.36 196.49.49.36 ***10.19.48.71***, 57.28.75.174 57.28.75.196 ***GET*** ***/api-dual/us*** HTTP/1.1  ***200*** 2013 0.764 https://secure.chas.com/ no-cache, max-age=0 Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36  Chrome/55.0.2883.8537 Safari/7.310.4031EB0C1D8BE068A9BD600440A5D792 jcXr-RtOQKZS804wxfPLCFZ3 - Web

----------------------------
[2/Jun/2009:07:36:19 -0600] api.co.us 196.49.49.36 ***196.49.49.36*** 10.19.48.71 - ***POST*** ***/api-dual/menu/Identity*** HTTP/1.1  *200* 47 0.659 - no-cache Apache CXF 3.1.0 - - - -

-------------------------------------------------------------------------
[2/Jun/2009:07:36:19 -0600]  196.49.49.47 ***196.49.49.47*** - ***GET*** ***/**api-login/*** HTTP/1.1 ***200*** 163 0.002 - - - - - - -
------------------------------------------------------------------------------
[[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.47 ***196.49.49.47***  10.19.48.71, 57.28.75.174 57.28.75.196 ***GET*** *****/api-dual/accounts/?status=Posted&toDate=2009-06-02&fromDate=2009-06-24&channelType=Mobile&action=next*** HTTP/1.1 ?status=Posted&toDate=2009-06-24&fromDate=2016-09-24&channelType=Mobile&action=next ***200*** 13898 0.194 https://secure.com max-age=0 Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13G36 F72AEE043E08F5020E2F5A84F58774EE zKplRZORI+gYDxth6rxIj8rZ  -chs26

Thanks


Re: How to generate a field extraction from my logs?

Motivator

Since the position of trueip seems to shift within the string, it may not be easy to extract a single IP unless it follows some pattern (either a fixed position in the string or something that marks where trueip begins or ends). For the other fields, however, you can try the approach below to extract the method, uri, and status from the sample data:

your query to return the events
| rex field=_raw "(?<method>(GET|POST))\s*(?<uri>[\S]+)\s*(?<protocol>(HTTP|\w+)\/[\S]+)\s*(.*?)(?<status>[\d]+)\s\d+\s*\d+\.\d+"
| table method, uri, status

If the method can be something other than GET or POST, add the alternatives to the capturing group, for example (GET|POST|MYACTION).
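To sanity-check the pattern outside Splunk, here is a minimal Python sketch that runs the same regex against the sample events from the question. It assumes the `***` markers are forum emphasis and not part of the real logs, so they are stripped, and the long user-agent tails are trimmed. The helper name `extract` is hypothetical. Python spells named groups `(?P<name>...)`, whereas `rex` accepts `(?<name>...)`.

```python
import re

# Same pattern as the rex above, rewritten with Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"(?P<method>GET|POST)\s*(?P<uri>\S+)\s*(?P<protocol>(?:HTTP|\w+)/\S+)\s*.*?"
    r"(?P<status>\d+)\s\d+\s*\d+\.\d+"
)

# Sample events from the question, emphasis markers removed and tails trimmed.
EVENTS = [
    "[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.36 196.49.49.36 "
    "10.19.48.71, 57.28.75.174 57.28.75.196 GET /api-dual/us HTTP/1.1  "
    "200 2013 0.764 https://secure.chas.com/ no-cache, max-age=0",
    "[2/Jun/2009:07:36:19 -0600] api.co.us 196.49.49.36 196.49.49.36 "
    "10.19.48.71 - POST /api-dual/menu/Identity HTTP/1.1  200 47 0.659 - no-cache",
    "[2/Jun/2009:07:36:19 -0600]  196.49.49.47 196.49.49.47 - "
    "GET /api-login/ HTTP/1.1 200 163 0.002 - - - - - - -",
]


def extract(event):
    """Return method, uri and status for one event, or None if nothing matches."""
    match = PATTERN.search(event)
    if match is None:
        return None
    return {name: match.group(name) for name in ("method", "uri", "status")}


for event in EVENTS:
    print(extract(event))
```

The lazy `.*?` between the protocol and the status is what lets the pattern skip the repeated query string that appears after HTTP/1.1 in some events.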


Re: How to generate a field extraction from my logs?

Path Finder

Hi, thanks. This works. We need one more bit of help.

For the status field, a few events have a space and " -" after it. Can you please help us include this condition in the regex?

 [[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.47 ***196.49.49.47***  10.19.48.71, 57.28.75.174 57.28.75.196 ***GET*** *****/api-dual/accounts/?status=Posted&toDate=2009-06-02&fromDate=2009-06-24&channelType=Mobile&action=next*** HTTP/1.1 ?status=Posted&toDate=2009-06-24&fromDate=2016-09-24&channelType=Mobile&action=next ***200*** - 13898 0.194 https://secure.com max-age=0 Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_5 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13G36 F72AEE043E08F5020E2F5A84F58774EE zKplRZORI+gYDxth6rxIj8rZ  -chs26

Thanks


Re: How to generate a field extraction from my logs?

Motivator

Try this please:

your query to return the events
| rex field=_raw "(?<method>(GET|POST))\s*(?<uri>[\S]+)\s*(?<protocol>(HTTP|\w+)\/[\S]+)\s*(.*?)(?<status>[\d]+)(\s|\s-\s)\d+\s*\d+\.\d+"
| table method, uri, status

I replaced the \s after the status group with (\s|\s-\s), so the status can be followed by either a single space or " - ".
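As a rough check of that change, the Python sketch below applies the updated pattern to one event with the extra " - " and one without it. The events come from this thread with the `***` emphasis markers stripped and the tails trimmed (an assumption about what the raw logs look like). The helper name `status_of` is hypothetical.

```python
import re

# Updated pattern in Python syntax: the status may be followed by " " or " - ".
PATTERN = re.compile(
    r"(?P<method>GET|POST)\s*(?P<uri>\S+)\s*(?P<protocol>(?:HTTP|\w+)/\S+)\s*.*?"
    r"(?P<status>\d+)(?:\s|\s-\s)\d+\s*\d+\.\d+"
)

# Event with " - " between the status and the byte count.
WITH_DASH = (
    "[[2/Jun/2009:07:36:19 -0600] secure.com 196.49.49.47 196.49.49.47  "
    "10.19.48.71, 57.28.75.174 57.28.75.196 GET "
    "/api-dual/accounts/?status=Posted&toDate=2009-06-02&fromDate=2009-06-24"
    "&channelType=Mobile&action=next HTTP/1.1 "
    "?status=Posted&toDate=2009-06-24&fromDate=2016-09-24&channelType=Mobile"
    "&action=next 200 - 13898 0.194 https://secure.com max-age=0"
)

# Event where the status is followed directly by the byte count.
WITHOUT_DASH = (
    "[2/Jun/2009:07:36:19 -0600]  196.49.49.47 196.49.49.47 - "
    "GET /api-login/ HTTP/1.1 200 163 0.002 - - - - - - -"
)


def status_of(event):
    """Return the extracted status string, or None if the pattern does not match."""
    match = PATTERN.search(event)
    return match.group("status") if match else None


print(status_of(WITH_DASH), status_of(WITHOUT_DASH))
```

The group `(?:\s|\s-\s)` could also be written as `\s(?:-\s)?`, which matches the same two cases.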
