Why is my props.conf and transforms.conf configuration not extracting fields from access_combined logs with a vhost?

lukas_loder
Communicator

Hi

I have a problem with my access_combined logs, which have a vhost at the beginning, like this:

www.domain.com:80 10.60.50.40 - - [04/Nov/2015:11:14:26 +0100] "GET /path/to/file/custom/flexslider.css HTTP/1.1" 200 1663 "http://www.domain.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"

When I index them, the access_combined fields are not extracted.
I have already tried creating a new transforms.conf and props.conf.

I'm indexing these logs with sourcetype=webserver_access_combined

Props.conf

[webserver_access_combined]
pulldown_type = true 
maxDist = 28
MAX_TIMESTAMP_LOOKAHEAD = 128
REPORT-access = vhost-access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[
category = Web
description = National Center for Supercomputing Applications (NCSA) combined format HTTP web server logs (can be generated by apache or other web servers)

Transforms.conf

[vhost-access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)  
# Note: referer is misspelled in purpose because that is the "official" spelling for "HTTP referer" 
REGEX = ^[[nspaces:vhost]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

I have these configurations on my indexers. I also see the logs with the correct sourcetype, but the field extraction doesn't work.

Does somebody have an idea why it doesn't work?

Thanks!

woodcock
Esteemed Legend

Your REGEX is crazy; try this one:

REGEX=^(?<vhost>\S+)\s+(?<clientip>\S+)\s++(?<ident>\S+)\s+(?<user>\S+)\s+\[(?<req_time>[^\]]+)\]\s+"(?<access_request>[^"]+)"\s+(?<status>\S+)\s+(?<bytes>\S+)\s+"(?<referrer>[^"]+)"\s+"(?<user_agent>[^"]+)"
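
A pattern like this can be sanity-checked outside Splunk against the sample event from the question. The following is only a rough Python sketch, not Splunk's own behavior: Python's `re` module needs `(?P<name>...)` instead of `(?<name>...)`, the possessive `\s++` is written as plain `\s+` because Python versions before 3.11 reject possessive quantifiers, and `extract_fields` is a hypothetical helper name. Python's engine is not identical to Splunk's PCRE, so a match here is a hint rather than proof.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    'www.domain.com:80 10.60.50.40 - - [04/Nov/2015:11:14:26 +0100] '
    '"GET /path/to/file/custom/flexslider.css HTTP/1.1" 200 1663 '
    '"http://www.domain.com/" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"'
)

# The suggested REGEX, translated to Python named-group syntax.
PATTERN = re.compile(
    r'^(?P<vhost>\S+)\s+(?P<clientip>\S+)\s+(?P<ident>\S+)\s+(?P<user>\S+)\s+'
    r'\[(?P<req_time>[^\]]+)\]\s+"(?P<access_request>[^"]+)"\s+'
    r'(?P<status>\S+)\s+(?P<bytes>\S+)\s+'
    r'"(?P<referrer>[^"]+)"\s+"(?P<user_agent>[^"]+)"'
)


def extract_fields(line):
    """Return a dict of field name -> value, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


fields = extract_fields(SAMPLE)

if __name__ == "__main__":
    # Print each extracted field, or report that nothing matched.
    if fields is None:
        print("no match")
    else:
        for name, value in fields.items():
            print(f"{name} = {value}")
```

Note that with this pattern `vhost` still contains the port (`www.domain.com:80`), and a line with an empty referer (`""`) would not match because `[^"]+` requires at least one character.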

hagjos43
Contributor

Did you build out your extractions and confirm them in something like regex101? I copied your example log and your extractions, and they did not match. I started on it, and for the first few fields it would look more like this: ^(?<vhost>\S+):(?<clientport>\d+)\s(?<clientip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s

Also you'll want your extractions to take place at search-time in your props.conf like this:

EXTRACT-blah = ^(?<vhost>\S+):(?<clientport>\d+)\s(?<clientip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s
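
To illustrate the prefix extraction, here is a small Python sketch run against the start of the sample event. It is an approximation under stated assumptions: the groups use Python's `(?P<name>...)` syntax, the dots in the IP address are escaped so they match only literal dots, and the pattern is anchored with `^`, since a leading `\n` can only match when a newline precedes the vhost. `parse_prefix` is a hypothetical helper name, and the field name `clientport` is kept from the thread even though the value after the vhost is the port from the log prefix.

```python
import re

# Sample event prefix copied from the question (the rest of the line is omitted).
SAMPLE = 'www.domain.com:80 10.60.50.40 - - [04/Nov/2015:11:14:26 +0100] "GET / HTTP/1.1"'

# Anchored variant: vhost, port, and client IP at the start of the event.
PREFIX = re.compile(
    r'^(?P<vhost>\S+):(?P<clientport>\d+)\s'
    r'(?P<clientip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s'
)

# Variant with a leading newline, shown only for comparison.
NEWLINE_PREFIX = re.compile(
    r'\n(?P<vhost>\S+):(?P<clientport>\d+)\s'
    r'(?P<clientip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s'
)


def parse_prefix(line):
    """Return vhost, clientport, and clientip as a dict, or None if no match."""
    match = PREFIX.search(line)
    return match.groupdict() if match else None


prefix_fields = parse_prefix(SAMPLE)
# A single-line event has no newline before the vhost, so this finds nothing.
newline_match = NEWLINE_PREFIX.search(SAMPLE)

if __name__ == "__main__":
    print(prefix_fields)
    print(newline_match)
```

Unlike the single `\S+` for the whole first token, this splits the vhost and the port into separate fields.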

lukas_loder
Communicator

I just used the original that was in transforms.conf, which looks like this:

REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

and tried to change this one... so this isn't the correct way?

hagjos43
Contributor

Based on what I'm seeing, that won't work. To see if your regex works, do something like this:

Your Search | rex "^(?<vhost>\S+)\s+(?<clientip>\S+)\s++(?<ident>\S+)\s+(?<user>\S+)\s+\[(?<req_time>[^\]]+)\]\s+\"(?<access_request>[^\"]+)\"\s+(?<status>\S+)\s+(?<bytes>\S+)\s+\"(?<referrer>[^\"]+)\"\s+\"(?<user_agent>[^\"]+)\""