I want to index our Apache error logs. There's just one nasty problem: there are multiple formats for events in the logs.
For example, PHP errors are formatted one way and Smarty errors another. (There are 15 formats, each with 1-4 variants, in our logs.)
What I would like to do is send the entire file to a single index (e.g., Apache_error), but apply different sourcetypes based on the format of the log line.
I think I need something like the following in inputs.conf. Can someone confirm whether this is the appropriate way to do it, and tell me which parameter specifies the regex for each log line type?
[monitor:///apache/log/location/error_log]
{some sort of regex/transforms.conf/props.conf to extract only one format of line}
index=Apache_error
source=Apache_error
sourcetype={name of sourcetype to match log line format}
{additional monitor stanzas as needed to cover all log line formats}
Or if this isn't the way to do it, what IS the right way to do it?
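To make the goal concrete, here is a rough Python sketch of the per-line classification I have in mind. It is only an illustration of the logic, not Splunk configuration: the sourcetype names and the patterns are hypothetical, and the rules are ordered most-specific-first on the assumption that the first match should win.

```python
import re

# Hypothetical sourcetype names and patterns, most specific first (first match wins).
RULES = [
    ("apache_error_zkerror", re.compile(r"\] \[ZkError:\S+\] ")),
    ("apache_error_ips_php", re.compile(r"\] \[IPS_PHP:\S+\] ")),
    ("apache_error_php", re.compile(r"\[client [\d.]+\] PHP \w+: ")),
    ("apache_error_client", re.compile(r"^\[[^\]]+\] \[\w+\] \[client [\d.]+\] ")),
    ("apache_error_server", re.compile(r"^\[[^\]]+\] \[\w+\] ")),
]
DEFAULT = "apache_error_other"  # lines with no timestamp prefix (shell output, etc.)


def classify(line: str) -> str:
    """Return the name of the first rule whose pattern matches the line."""
    for name, pattern in RULES:
        if pattern.search(line):
            return name
    return DEFAULT


if __name__ == "__main__":
    samples = [
        "sh: -c: line 1: syntax error: unexpected end of file",
        "[Mon Jan 19 12:23:38 2015] [notice] Digest: done",
        "[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] user not found: /relative/requested/URI",
    ]
    for sample in samples:
        print(classify(sample), "<-", sample)
```

The ordering matters for the same reason my rex commands are order-sensitive: the generic "[timestamp] [level]" pattern also matches every more specific client line, so it has to come last.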
EDIT: Adding a sample of data (there are 15 total log line formats) and the search I wrote that takes the sample and extracts all the fields (using "rex"). It's UGLY, and I'm not sure it will work properly with the props.conf / transforms.conf thing, because these rex commands are order-dependent: if you don't run them in the EXACT order shown in the search (or as near to exact as makes no nevermind), they stomp on each other, and I'm not sure how inheritance works with props.conf / transforms.conf.
Regex is NOT my strong point, so if anyone has suggestions on how to make the rex commands better, PLEASE tell me! 🙂 NOTE: I tried every combination I could think of for the ones that are identical except for the trailing ", referer: " bit (which is 100% optional on every log line). For some unknown reason, (, referer: (?<error_referer>\S+)|\n) just does NOT work for them, so I ended up having to do separate rex commands for them.
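For anyone who wants to reproduce the referer problem outside Splunk, here is a minimal Python sketch. It assumes the event text has no trailing newline (which I believe is the case for single-line events in _raw, but I have not confirmed it), and note that Python spells named groups (?P<name>...) where rex uses (?<name>...). My guess is that the "|\n" branch can never match under that assumption, so lines without a referer fail entirely; a lazy message followed by an optional group anchored at end of line seems to handle both cases.

```python
import re

# Shortened prefix for illustration; Python uses (?P<name>...) for named groups.
PREFIX = r"\[client (?P<error_client>\d+\.\d+\.\d+\.\d+)\] "

# Shape of my original attempt: greedy message, then ", referer: ..." OR a newline.
greedy = re.compile(
    PREFIX + r"(?P<error_message>.*)(, referer: (?P<error_referrer>\S+)|\n)"
)

# Possible alternative: lazy message, optional non-capturing referer group, end anchor.
lazy = re.compile(
    PREFIX + r"(?P<error_message>.*?)(?:, referer: (?P<error_referrer>\S+))?$"
)

with_ref = (
    "[Mon Jan 19 12:23:38 2015] [error] [client 10.110.70.254] "
    "request failed: error message, referer: http://referrer.url"
)
without_ref = (
    "[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] "
    "user not found: /relative/requested/URI"
)

if __name__ == "__main__":
    for label, line in (("with referer", with_ref), ("without referer", without_ref)):
        g = greedy.search(line)
        l = lazy.search(line)
        print(label)
        print("  greedy:", g.groupdict() if g else "no match")
        print("  lazy:  ", l.groupdict() if l else "no match")
```

If the same behavior holds in rex, the lazy form would let one command cover both the with-referer and without-referer variants instead of two separate ones.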
Sample (genericized) data:
TABULAR DATA NAME
Trying to open /export/sites/bondbuyer_05/data/import/tables/filename.txt
Error message at /export/sites/bondbuyer_05/bin/custom/tabular_data_converters/filexml line ###.
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
ls: *[some-string]*: No such file or directory
[Mon Jan 19 12:23:38 2015] [notice] Apache configured -- resuming normal operations
[Mon Jan 19 12:23:38 2015] [notice] Digest: done
[Mon Jan 19 12:23:38 2015] [notice] Digest: generating secret for digest authentication ...
[Mon Jan 19 12:23:38 2015] [notice] Graceful restart requested, doing restart
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] Invalid Type in request (GET or POST or Secure or garbage) (relative or absolute URI or *) Secure-HTTP/(version) 200 OK
[Mon Jan 19 12:23:38 2015] [error] [client 10.110.70.254] client sent (error message)
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] script '/export/sites/requested/script/location' not found or unable to stat, referer: http://requested.script/referrer
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] script not found or unable to stat: /export/sites/requested/script/location, referer: http://requested.script/referrer
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] File does not exist: /export/sites/requested/object/location, referer: http://requested.object/referrer
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] user not found: /relative/requested/URI
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] user username: authentication for "/relative/requested/URI": Password Mismatch, referer: http://requested.URI/referrer
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] Directory index forbidden by Options directive: /export/sites/requested/directory/location, referer: http://requested.directory/referrer
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] PHP Warning: error message in /export/sites/requested/php/location.php on line ####, referer: http://referrer.url
[Mon Jan 19 12:23:38 2015] [error] [client 10.110.70.254] request failed: error message, referer: http://referrer.url
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] client denied by server configuration: /export/sites/path/to/requested/file, referer: http://referrer.url
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] (36)File name too long: access to /relative/path/to/requested/file failed, referer: http://referrer.url
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] (13)Permission denied: access to /relative/path/to/requested/file denied, referer: http://referrer.url
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] [Mon Jan 19 12:23:38 2015] [ZkError:Type] "Error message which may include portions wrapped in "double quotes" in /absolute/URI : eval()'d code at line ####" URI: http:///relative/URI/to.referrer APACHE: (Apache.cookie.value|--unset/empty--), referer: http://requested.object/referrer
[Mon Jan 19 12:23:38 2015] [error] [client 174.35.32.146] [Mon Jan 19 12:23:38 2015] [IPS_PHP:Type] "Error message in /absolute/URI at line ###" URI: http://internal.system.domain.com/full/URL/path APACHE: --unset/empty--
Search command to extract fields from all data (I've added comments referencing line numbers from the sample data above to explain which rex goes with which data):
#Lines 1-6
rex field=_raw "(?<error_message>.*)" |
#Lines 7 and 10
rex field=_raw "\[(?<error_time>\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4})\] \[(?<error_type>\w+)\] (?<error_message>.*)" |
#Lines 8-9
rex field=_raw "\[(?<error_time>\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4})\] \[(?<error_type>\w+)\] (?<error_class>[\w\s\d]+): (?<error_message>.*)" |
#Lines 11-12
rex field=_raw "\[(?<error_time>\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4})\] \[(?<error_type>\w+)\] \[client (?<error_client>\d+\.\d+\.\d+\.\d+)\] (?<error_message>.*)" |
#Line 13
rex field=_raw "\[(?<error_time>\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4})\] \[(?<error_type>\w+)\] \[client (?<error_client>\d+\.\d+\.\d+\.\d+)\] (?<error_message>.*)(, referer: (?<error_referrer>.*))" |
#Lines 14-23
rex field=_raw "\[(?<error_time>\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4})\] \[(?<error_type>\w+)\] \[client (?<error_client>\d+\.\d+\.\d+\.\d+)\] (|\(\d+\))(?<error_class>[\w\s\d]+): (?<error_message>.*)(, referer: (?<error_referrer>.*)|\n)" |
#Line 24
rex field=_raw "\[(?<error_time>\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4})\] \[(?<error_type>\w+)\] \[client (?<error_client>\d+\.\d+\.\d+\.\d+)\] \[\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4}\] \[(?<error_class>ZkError:\S+)\] \"(?<error_message>.*)\" URI: (?<error_uri>\S+) APACHE: (?<error_apache>\S+), referer: (?<error_referrer>.*)" |
#Line 25
rex field=_raw "\[(?<error_time>\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4})\] \[(?<error_type>\w+)\] \[client (?<error_client>\d+\.\d+\.\d+\.\d+)\] \[\w{3} \w{3} \d+ \d{2}:\d{2}:\d{2} \d{4}\] \[(?<error_class>IPS_PHP:\S+)\] \"(?<error_message>.*)\" URI: (?<error_uri>\S+) APACHE: (?<error_apache>\S+)" |
table _raw, error_time, error_type, error_client, error_class, error_message, error_uri, error_apache, error_referrer
90% of our errors are in the format matching line 24.