How to write the regex for transforms.conf to extract fields and assign the proper sourcetype for my sample log format?

Builder

I need your help.

We have logs in the format below and need to assign a sourcetype and extract the fields. Can you please provide a working regex to include in transforms.conf?

2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

Fields:

timestamp
elb
client
backend
request_processing_time
backend_processing_time
response_processing_time
elb_status_code
backend_status_code
received_bytes
sent_bytes
request
useragent
ssl_cipher
ssl_protocol

I have tried the following, but it is not working for me:

transforms.conf:

[s3-access-extractions]
REGEX = ^[[nspaces:req_time]]\s++[[nspaces:elb]]\s++[[nspaces:client]]\s++[[sbstring:backend]]\s++[[nspaces:request_processing_time]]\s++[[nspaces:backend_processing_time]]\s++[[nspaces:response_processing_time]]\s++[[nspaces:elb_status_code]]\s++[[nspaces:backend_status_code]]\s++[[nspaces:received_bytes]]\s++[[nspaces:sent_bytes]]\s++[[access-request]](?:\s++[[qstring:useragent]]\s++[[nspaces:ssl_cipher]]\s++[[nspaces:ssl_protocol]]

props.conf

[s3_access_combined]
REPORT-access = s3-access-extractions
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6NZ
EVAL-date_hour = strftime(_time,"%H")
EVAL-date_mday = strftime(_time,"%d")
EVAL-date_minute = strftime(_time,"%M")
EVAL-date_month = strftime(_time,"%m")
EVAL-date_second = strftime(_time,"%S")
EVAL-date_wday = strftime(_time,"%A")
EVAL-date_year = strftime(_time,"%Y")
category = Custom
pulldown_type = true

[rule::s3_access_combined]
sourcetype = s3_access_combined
MORE_THAN_75 = ^\S+ \S+ \S+ \S* ?\[[^\]]+\] "[^"]*" \S+ \S+ \S+ "[^"]*"$

Esteemed Legend

Forget transforms.conf for now and try this:

props.conf

[s3_access_combined]
EXTRACT-s3-access-extractions = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+"(?<access_request>[^"]+)"\s+"(?<useragent>[^"]+)"\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)
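
To sanity-check the extraction outside Splunk, here is a minimal Python sketch that runs the same regex against the sample event from the question. It assumes Python's `re` module behaves like Splunk's PCRE engine for this pattern, and it rewrites `(?<name>` as `(?P<name>` because that is how Python spells named groups. The helper name `extract_fields` is made up for this illustration.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    '2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 '
    '0.000049 0.002743 0.000021 200 200 0 686 '
    '"GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) '
    'Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1'
)

# Regex from the EXTRACT line, split across lines only for readability.
SPLUNK_REGEX = (
    r'^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)'
    r'\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)'
    r'\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)'
    r'\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)'
    r'\s+(?<sent_bytes>[\S]+)\s+"(?<access_request>[^"]+)"'
    r'\s+"(?<useragent>[^"]+)"\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)'
)

# Python's re module uses (?P<name>...) for named groups, so translate the syntax.
PY_REGEX = SPLUNK_REGEX.replace('(?<', '(?P<')


def extract_fields(line):
    """Return a dict of field name -> value, or None if the line does not match."""
    match = re.search(PY_REGEX, line)
    return match.groupdict() if match else None


if __name__ == '__main__':
    fields = extract_fields(SAMPLE)
    if fields is None:
        print('no match')
    else:
        for name, value in fields.items():
            print(f'{name} = {value}')
```

If a field comes back wrong here, it will most likely come back wrong in Splunk too, so this is a cheap way to iterate on the pattern before touching props.conf.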

Builder

Thank you so much. I have added the following to transforms.conf and it is working fine:

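REGEX = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+[[access-request]]\s+[[qstring:useragent]]\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)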

Splunk Employee

Put this in props.conf, under the sourcetype s3_access_combined:

[s3_access_combined]
EXTRACT-elb,client,backend,request_processing_time,backend_processing_time,response_processing_time,elb_status_code,backend_status_code,received_bytes,sent_bytes,request,user_agent,ssl_cipher,ssl_protocol = ^[^ \n]* (?P<elb>[^ ]+)[^ \n]* (?P<client>[^ ]+)[^ \n]* (?P<backend>[^ ]+)\s+(?P<request_processing_time>[^ ]+)[^ \n]* (?P<backend_processing_time>\d+\.\d+)\s+(?P<response_processing_time>\d+\.\d+)\s+(?P<elb_status_code>[^ ]+)[^ \n]* (?P<backend_status_code>\d+)[^ \n]* (?P<received_bytes>\d+)[^ \n]* (?P<sent_bytes>[^ ]+)[^ \n]* "(?P<request>[^"]+)"\s+"(?P<user_agent>[^"]+)[^"\n]*"\s+(?P<ssl_cipher>[^ ]+)\s+(?P<ssl_protocol>.+)

Path Finder

What if the field sequence changes? For example, the first event below is in the original order and the second is not:
2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

2015-08-07T18:59:32.388226Z pnews-api Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0)
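
A quick way to see how a positional regex reacts to a reordered event is to test both lines outside Splunk. The sketch below is only an illustration: it rebuilds the pattern from the accepted answer in Python syntax (`\S+` in place of the equivalent `[\S]+`, `(?P<name>` in place of `(?<name>`), and assumes Python's `re` behaves like Splunk's PCRE engine here. The helper `matches` is a made-up name.

```python
import re

# Whitespace-delimited fields, in the order the accepted answer expects them.
PLAIN_FIELDS = [
    'req_time', 'elb', 'client', 'backend',
    'request_processing_time', 'backend_processing_time',
    'response_processing_time', 'elb_status_code', 'backend_status_code',
    'received_bytes', 'sent_bytes',
]

# Positional pattern: eleven bare tokens, two quoted strings, two bare tokens.
PATTERN = re.compile(
    '^'
    + r'\s+'.join(rf'(?P<{name}>\S+)' for name in PLAIN_FIELDS)
    + r'\s+"(?P<access_request>[^"]+)"\s+"(?P<useragent>[^"]+)"'
    + r'\s+(?P<ssl_cipher>\S+)\s+(?P<ssl_protocol>\S+)'
)

# First event from the post: fields in the original order.
ORIGINAL = (
    '2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 '
    '0.000049 0.002743 0.000021 200 200 0 686 '
    '"GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) '
    'Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1'
)

# Second event from the post: same values, different order.
REORDERED = (
    '2015-08-07T18:59:32.388226Z pnews-api Gecko/20100101 Firefox/24.0" '
    'ECDHE-RSA-AES128-SHA TLSv1 1.1.2.1:5681 10.4.0.81:8081 '
    '0.000049 0.002743 0.000021 200 200 0 686 '
    '"GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0)'
)


def matches(line):
    """True if the positional pattern matches the whole field layout of the line."""
    return PATTERN.search(line) is not None


if __name__ == '__main__':
    print('original order matches: ', matches(ORIGINAL))
    print('reordered line matches:', matches(REORDERED))
```

Because every capture group is tied to a position, the pattern only fits events that keep the original field order. If events really can arrive in more than one layout, each layout would presumably need its own extraction.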
