When you look at access_combined in .../etc/system/default/props.conf, you see that it uses a transform:

REPORT-access = access-extractions

When you then look that up, you see a lot of recursive definitions into which the parts are divided. Those come after this comment:

######## access-extractions helpers start ########
# make sure to handle escaped quotes (\") inside the URI

[access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled in purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

All of those [[<type>:<field name>]] entries are tokens that get values. Look up what each <type> is defined as; you may then need to look again at what that definition contains, and so on. For example, access-request is:

[access-request]
# very relaxed regex for extracting fields from the request
REGEX = "\s*+[[reqstr:method]]?(?:\s++[[bc_uri]](?:\s++[[reqstr:version]])*)?\s*+"

It contains, for example, bc_uri, which is defined as:

[bc_uri]
# backwards compatible uri regex
# uri = path optionally followed by query [/this/path/file.js?query=part&other=var]
# path = root part followed by file [/root/part/file.part]
# Extracts: uri, uri_path, root, file, uri_query, uri_domain (optional if in proxy mode)
REGEX = (?<uri>[[bc_domain:uri_]]?+(?<uri_path>[[uri_root]]?[[uri_seg]]*(?<file>[^\s\?/]+)?)(?:\?(?<uri_query>[^\s]*))?)

And so on.
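If it helps to picture how these nested definitions unfold, here is a minimal Python sketch of the idea, not Splunk's actual implementation. The stanzas in `STANZAS` are hypothetical toy definitions (not the ones shipped in the default files), and the function `expand` is made up for illustration. It assumes that `[[type:name]]` wraps the expanded definition in a named group; it does not model prefix forms such as `referer_`.

```python
import re

# Hypothetical, simplified stanzas for illustration only.
# These are NOT the definitions shipped with Splunk.
STANZAS = {
    "request": r'"[[word:method]]\s+[[path]]"',
    "word": r"\w+",
    "path": r"(?<uri>/[[word:file]])",
}

# Matches [[type]] or [[type:name]]
TOKEN = re.compile(r"\[\[([\w-]+)(?::(\w+))?\]\]")


def expand(regex, stanzas, depth=0):
    """Recursively replace [[type]] / [[type:name]] tokens with their definitions."""
    if depth > 20:
        raise RecursionError("token nesting too deep; possible cycle")

    def substitute(match):
        stanza, field = match.group(1), match.group(2)
        # The referenced definition may itself contain tokens, so expand it first.
        body = expand(stanzas[stanza], stanzas, depth + 1)
        # Assumption: a field name wraps the expansion in a named capture group.
        return f"(?<{field}>{body})" if field else body

    return TOKEN.sub(substitute, regex)


expanded = expand(STANZAS["request"], STANZAS)
print(expanded)
# prints: "(?<method>\w+)\s+(?<uri>/(?<file>\w+))"
```

Doing the same by hand for access-extractions means following each token (nspaces, sbstring, access-request, bc_uri, and so on) down until no `[[...]]` remains.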