How to write the regex for transforms.conf to extract fields and assign the proper sourcetype for my sample log format?

dhavamanis
Builder

I need your help.

We have logs in the format below and need to assign a sourcetype and extract the fields. Could you please provide a working regex that we can put in transforms.conf?

2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

fields:

timestamp
elb
client
backend
request_processing_time
backend_processing_time
response_processing_time
elb_status_code
backend_status_code
received_bytes
sent_bytes
request
user_agent
ssl_cipher
ssl_protocol

I have tried the following, but it is not working for me:

transforms.conf:

[s3-access-extractions]
REGEX = ^[[nspaces:req_time]]\s++[[nspaces:elb]]\s++[[nspaces:client]]\s++[[sbstring:backend]]\s++[[nspaces:request_processing_time]]\s++[[nspaces:backend_processing_time]]\s++[[nspaces:response_processing_time]]\s++[[nspaces:elb_status_code]]\s++[[nspaces:backend_status_code]]\s++[[nspaces:received_bytes]]\s++[[nspaces:sent_bytes]]\s++[[access-request]](?:\s++[[qstring:useragent]]\s++[[nspaces:ssl_cipher]]\s++[[nspaces:ssl_protocol]]

props.conf

[s3_access_combined]
REPORT-access = s3-access-extractions
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6NZ
EVAL-date_hour = strftime(_time,"%H")
EVAL-date_mday = strftime(_time,"%d")
EVAL-date_minute = strftime(_time,"%M")
EVAL-date_month = strftime(_time,"%m")
EVAL-date_second = strftime(_time,"%S")
EVAL-date_wday = strftime(_time,"%A")
EVAL-date_year = strftime(_time,"%Y")
category = Custom
pulldown_type = true

[rule::s3_access_combined]
sourcetype = s3_access_combined
MORE_THAN_75 = ^\S+ \S+ \S+ \S* ?\[[^\]]+\] "[^"]*" \S+ \S+ \S+ "[^"]*"$

woodcock
Esteemed Legend

Forget transforms.conf for now and try this:

props.conf

[s3_access_combined]
EXTRACT-s3-access-extractions = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+"(?<access_request>[^"]+)"\s+"(?<useragent>[^"]+)"\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)
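
One way to sanity-check the extraction outside Splunk is to run the same pattern against the sample event from the question. The sketch below is only an approximation of what Splunk does at search time: Python's re module requires (?P<name>...) instead of (?<name>...), and [\S]+ is written as the equivalent \S+. The names LEADING, PATTERN, and extract_fields are made up for this illustration.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    '2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 '
    '0.000049 0.002743 0.000021 200 200 0 686 '
    '"GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) '
    'Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1'
)

# The eleven unquoted leading fields, in the order used by the EXTRACT above.
LEADING = [
    "req_time", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes",
]

# Same structure as the EXTRACT regex, translated to Python group syntax.
PATTERN = re.compile(
    "^"
    + r"\s+".join(rf"(?P<{name}>\S+)" for name in LEADING)
    + r'\s+"(?P<access_request>[^"]+)"\s+"(?P<useragent>[^"]+)"'
    + r"\s+(?P<ssl_cipher>\S+)\s+(?P<ssl_protocol>\S+)"
)


def extract_fields(line):
    """Return a dict of field name to value, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


fields = extract_fields(SAMPLE)
for name, value in (fields or {}).items():
    print(f"{name} = {value}")
```

If every field prints with the expected value, the pattern itself is probably fine, and any remaining problem is more likely in how the sourcetype is assigned than in the regex.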

dhavamanis
Builder

Thank you so much. I added the following to transforms.conf and it is working fine:

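REGEX = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+[[access-request]]\s+[[qstring:useragent]]\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)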

jnussbaum_splun
Splunk Employee

Put this in props.conf, under the sourcetype s3_access_combined:

[s3_access_combined]
EXTRACT-elb,client,backend,request_processing_time,backend_processing_time,response_processing_time,elb_status_code,backend_status_code,received_bytes,sent_bytes,request,user_agent,ssl_cipher,ssl_protocol = ^[^ \n]* (?P<elb>[^ ]+)[^ \n]* (?P<client>[^ ]+)[^ \n]* (?P<backend>[^ ]+)\s+(?P<request_processing_time>[^ ]+)[^ \n]* (?P<backend_processing_time>\d+\.\d+)\s+(?P<response_processing_time>\d+\.\d+)\s+(?P<elb_status_code>[^ ]+)[^ \n]* (?P<backend_status_code>\d+)[^ \n]* (?P<received_bytes>\d+)[^ \n]* (?P<sent_bytes>[^ ]+)[^ \n]* "(?P<request>[^"]+)"\s+"(?P<user_agent>[^"]+)[^"\n]*"\s+(?P<ssl_cipher>[^ ]+)\s+(?P<ssl_protocol>.+)

AnilPujar
Path Finder

What if the field sequence changes? For example, the first event below is in the original order and the second has the fields reordered:
2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

2015-08-07T18:59:32.388226Z pnews-api Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0)
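
This question was not answered in the thread, but a quick check outside Splunk suggests what to expect: the accepted regex is purely positional, so it should match the first event and fail on the reordered one. The sketch below assumes a Python translation of that regex ((?P<name>...) group syntax, \S+ for [\S]+); PATTERN and matches_layout are names invented for this illustration, and Splunk's own regex engine may differ in details.

```python
import re

# First event from the post above: fields in the original order.
ORIGINAL = (
    '2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 '
    '0.000049 0.002743 0.000021 200 200 0 686 '
    '"GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) '
    'Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1'
)

# Second event from the post above: same tokens, different order.
REORDERED = (
    '2015-08-07T18:59:32.388226Z pnews-api Gecko/20100101 Firefox/24.0" '
    'ECDHE-RSA-AES128-SHA TLSv1 1.1.2.1:5681 10.4.0.81:8081 '
    '0.000049 0.002743 0.000021 200 200 0 686 '
    '"GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0)'
)

# Positional pattern: eleven whitespace-separated tokens, two quoted
# strings, then two more tokens. Group names are omitted because only
# the match/no-match result matters here.
PATTERN = re.compile(
    "^"
    + r"\s+".join([r"(\S+)"] * 11)
    + r'\s+"([^"]+)"\s+"([^"]+)"\s+(\S+)\s+(\S+)'
)


def matches_layout(line):
    """Return True if the line fits the positional layout the regex expects."""
    return PATTERN.match(line) is not None


print("original matches: ", matches_layout(ORIGINAL))
print("reordered matches:", matches_layout(REORDERED))
```

When a positional extraction does not match an event, no fields are extracted from that event, so logs that can arrive in more than one layout would likely need one extraction per layout or a separate sourcetype for each.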
