Hi All,
We are trying to index a set of webservers that each serve many virtual hosts. Getting the virtual host into the log is simple enough in Apache, by changing the LogFormat from
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
to
LogFormat "%V %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" vcombined
However, this breaks the automatic field extraction that Splunk normally does for Apache log files.
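To illustrate why the extraction breaks, here is a rough Python sketch. The regex is a simplified stand-in for a combined-format pattern, not Splunk's actual access-extractions transform, and both sample log lines are made up. The point is only that a pattern anchored at the start of the line expects the bracketed timestamp as the fourth field, so a leading vhost field pushes everything one position to the right and the match fails.

```python
import re

# Simplified stand-in for a combined-format pattern.
# Assumption: this is NOT Splunk's real transform, only an approximation.
COMBINED = re.compile(
    r'^(?P<clientip>\S+)\s+(?P<ident>\S+)\s+(?P<user>\S+)\s+'
    r'\[(?P<req_time>[^\]]+)\]\s+"(?P<request>[^"]*)"\s+'
    r'(?P<status>\S+)\s+(?P<bytes>\S+)'
)

# Hypothetical sample lines, invented for illustration.
combined_line = ('10.0.0.5 - frank [10/Oct/2011:13:55:36 -0700] '
                 '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"')
vcombined_line = 'www.example.com ' + combined_line

plain = COMBINED.match(combined_line)      # matches the usual layout
shifted = COMBINED.match(vcombined_line)   # fails: 4th field is no longer "[...]"

print(plain.group('clientip'))
print(shifted)
```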
So I dug into /opt/splunk/etc/system/default/transforms.conf and found these lines
[access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled in purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?[[all:other]]
and in /opt/splunk/etc/system/default/props.conf found this
[access_combined]
pulldown_type = true
maxDist = 28
MAX_TIMESTAMP_LOOKAHEAD = 128
REPORT-access = access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[
I can see that I just need to add [[nspaces:vhost]]\s++ to the front of the transforms.conf regex, but obviously I don't want to modify the defaults.
I tried to replicate what I saw in props.conf and transforms.conf in my own app, but it does not seem to work.
My inputs.conf:
[monitor:///etc/httpd/logs/access_log*]
sourcetype = vhost_access_combined
disabled = false
followTail = 0
host = development.server.com
index = webserver
My props.conf:
[vhost_access_combined]
pulldown_type = true
maxDist = 28
MAX_TIMESTAMP_LOOKAHEAD = 128
REPORT-access = vhost-access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[
My transforms.conf:
[vhost-access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: vhost, clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled in purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:vhost]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?[[all:other]]
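For reference, this is roughly what I expect the extraction to produce, sketched in Python so it can be checked outside Splunk. The pattern only approximates the intent of the REGEX above (Splunk's modular [[...]] pieces are not reproduced), the helper name parse_vcombined is hypothetical, and the sample line is made up.

```python
import re

# Approximation of the intended vhost-prefixed extraction.
# Assumption: plain Python regex, not Splunk's modular regex syntax.
VCOMBINED = re.compile(
    r'^(?P<vhost>\S+)\s+(?P<clientip>\S+)\s+(?P<ident>\S+)\s+(?P<user>\S+)\s+'
    r'\[(?P<req_time>[^\]]+)\]\s+'
    r'"(?P<method>\S+)\s+(?P<uri>\S+)\s+(?P<version>[^"]+)"\s+'
    r'(?P<status>\d{3})\s+(?P<bytes>\S+)'
)

def parse_vcombined(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    m = VCOMBINED.match(line)
    return m.groupdict() if m else None

# Hypothetical sample line in the vcombined layout.
sample = ('www.example.com 10.0.0.5 - frank [10/Oct/2011:13:55:36 -0700] '
          '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"')

fields = parse_vcombined(sample)
print(fields['vhost'], fields['clientip'], fields['status'])
```

If the vhost field comes out here but not in Splunk, the problem is more likely in how my app's props.conf and transforms.conf are being picked up than in the field order itself.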
Any ideas how to get this working?
I have more complex follow-up questions about setting the host field in Splunk to the vhost value from each log entry, but I will take this in baby steps.