
Apache error log field extraction

Builder

I can see I'm not the only person who's had trouble extracting fields from Apache logs, because those logs are so customizable. Unfortunately, that customizability means that none of the other threads on the topic have been particularly helpful for me.

I have two basic types of entries in our Apache error logs, and I need to extract fields from both types. The IFX can't deal with the complexity, and regex is NOT my strong suit, nor is anyone else in my office particularly talented with it.

Can someone help me out with the extraction regex? I'm cool with using two extractions or whatever to make it work, it doesn't HAVE to be a single extraction (single would be nice, though, since there are multiple sourcetypes that I need to apply this to).

Sample logs:

[Fri Oct 11 15:23:36 2013] [error] [client 10.65.0.70] PHP 17. smarty_function_zk_validate_field() /export/sites/client_04/data/temp/smarty_compile/theme^%%FB^FBB^FBBB2D93%%leadgen_generic.tpl.php:78, referer: //client04.ipd-las-icarus.iproduction.com/subscribe/lead_webinar.html?product_id=44742&zkConsole=1
[Fri Oct 11 15:23:36 2013] [error] [client 10.65.0.70] PHP 1. {main}() /export/sites/client_04/htdocs/subscribe/lead_webinar.html:0, referer: //client04.ipd-las-icarus.iproduction.com/subscribe/lead_webinar.html?product_id=44742&zkConsole=1
[Fri Oct 11 15:23:36 2013] [error] [client 10.65.0.70] PHP Stack trace:, referer: //client04.ipd-las-icarus.iproduction.com/subscribe/lead_webinar.html?product_id=44742&zkConsole=1
[Fri Oct 11 08:51:55 2013] [error] [client 10.110.68.254] Invalid URI in request  HTTP/1.0
[Fri Oct 11 08:51:51 2013] [error] [client 10.110.68.254] Invalid URI in request  HTTP/1.0
[Fri Oct 11 08:51:15 2013] [error] [client 10.65.0.78] [Fri Oct 11 08:51:15 2013] [ZkError:Warning] "file_exists(): open_basedir restriction in effect. File(/export/sites/client_05/htdocs/news/-218927-1.html/index.html) is not within the allowed path(s): (/app/ips/diagnostics:/app/awstats:/export/sites) in /app/ips/diagnostics/json.php at line 153" URI: ///diagnostics/json.php?v=1.0&pageinfo_url=client05.ipd-las-icarus.iproduction.com/news/-218927-1.html?pos=oopih APACHE: 10.65.0.78.1380564815002415
[Fri Oct 11 08:50:51 2013] [error] [client 10.110.68.254] Invalid URI in request  HTTP/1.0
[Fri Oct 11 08:50:42 2013] [error] [client 10.65.0.78] [Fri Oct 11 08:50:42 2013] [ZkError:Warning] "file_exists(): open_basedir restriction in effect. File(/export/sites/client_05/htdocs/multimedia/-206901-1.html/index.html) is not within the allowed path(s): (/app/ips/diagnostics:/app/awstats:/export/sites) in /app/ips/diagnostics/json.php at line 153" URI: ///diagnostics/json.php?v=1.0&pageinfo_url=client05.ipd-las-icarus.iproduction.com/multimedia/-206901-1.html APACHE: 10.65.0.78.1380564815002415
[Fri Oct 11 08:50:39 2013] [error] [client 10.65.0.78] [Fri Oct 11 08:50:39 2013] [ZkError:Notice] "Undefined index: 0 in /export/sites/client_05/bin/ips/vendor/smarty/Smarty.class.php(1950) : eval()'d code at line 75" URI: ///multimedia/-206901-1.html APACHE: 10.65.0.78.1380564815002415

Extracted fields w/ results from sample logs above:

error_time

  • Fri Oct 11 15:23:36 2013
  • Fri Oct 11 15:23:36 2013
  • Fri Oct 11 15:23:36 2013
  • Fri Oct 11 08:51:55 2013
  • Fri Oct 11 08:51:51 2013
  • Fri Oct 11 08:51:15 2013
  • Fri Oct 11 08:50:51 2013
  • Fri Oct 11 08:50:42 2013
  • Fri Oct 11 08:50:39 2013

remote_host

  • 10.65.0.70
  • 10.65.0.70
  • 10.65.0.70
  • 10.110.68.254
  • 10.110.68.254
  • 10.65.0.78
  • 10.110.68.254
  • 10.65.0.78
  • 10.65.0.78

error_type

  • (none/doesn't exist)
  • (none/doesn't exist)
  • (none/doesn't exist)
  • (none/doesn't exist)
  • (none/doesn't exist)
  • ZkError:Warning
  • (none/doesn't exist)
  • ZkError:Warning
  • ZkError:Notice

error_message

  • smarty_function_zk_validate_field() /export/sites/client_04/data/temp/smarty_compile/theme^%%FB^FBB^FBBB2D93%%leadgen_generic.tpl.php:78
  • {main}() /export/sites/client_04/htdocs/subscribe/lead_webinar.html:0
  • PHP Stack trace
  • Invalid URI in request HTTP/1.0
  • Invalid URI in request HTTP/1.0
  • file_exists(): open_basedir restriction in effect. File(/export/sites/client_05/htdocs/news/-218927-1.html/index.html) is not within the allowed path(s): (/app/ips/diagnostics:/app/awstats:/export/sites) in /app/ips/diagnostics/json.php at line 153
  • Invalid URI in request HTTP/1.0
  • file_exists(): open_basedir restriction in effect. File(/export/sites/client_05/htdocs/multimedia/-206901-1.html/index.html) is not within the allowed path(s): (/app/ips/diagnostics:/app/awstats:/export/sites) in /app/ips/diagnostics/json.php at line 153
  • Undefined index: 0 in /export/sites/client_05/bin/ips/vendor/smarty/Smarty.class.php(1950) : eval()'d code at line 75

referrer (NOTE: these always start with http and a :, but I haven't posted enough on the Answers site yet for it to allow me to put URLs in my posts.)

  • //client04.ipd-las-icarus.iproduction.com/subscribe/lead_webinar.html?product_id=44742&zkConsole=1
  • //client04.ipd-las-icarus.iproduction.com/subscribe/lead_webinar.html?product_id=44742&zkConsole=1
  • //client04.ipd-las-icarus.iproduction.com/subscribe/lead_webinar.html?product_id=44742&zkConsole=1
  • (none/doesn't exist)
  • (none/doesn't exist)
  • ///diagnostics/json.php?v=1.0&pageinfo_url=client05.ipd-las-icarus.iproduction.com/news/-218927-1.html?pos=oopih
  • (none/doesn't exist)
  • ///diagnostics/json.php?v=1.0&pageinfo_url=client05.ipd-las-icarus.iproduction.com/multimedia/-206901-1.html
  • ///multimedia/-206901-1.html

cookie

  • (none/doesn't exist)
  • (none/doesn't exist)
  • (none/doesn't exist)
  • (none/doesn't exist)
  • (none/doesn't exist)
  • 10.65.0.78.1380564815002415
  • (none/doesn't exist)
  • 10.65.0.78.1380564815002415
  • 10.65.0.78.1380564815002415

Re: Apache error log field extraction

SplunkTrust

Try this:

props.conf
[my_apache_sourcetype]
REPORT-ext = my_apache_sourcetype_extractions

transforms.conf
[my_apache_sourcetype_extractions]
REGEX = \[([^\]]*)\]\s+\[([^\]]*)\]\s+\[client\s([^\]]*)\]\s+((\[[^\]]*\]|[\s\w/\d\.]+))?(\s+\[([^\]]*)]\s*)?("([^"]*)"\s*URI:\s*([^\s]*)\s*APACHE:\s*([^\s]*))?
FORMAT = raw_date::$1 log_level::$2 src_ip::$3 error_type::$6 error_message::$5 error_uri::$10 cookie::$11
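
To see which numbered capture group each $N in FORMAT refers to, the REGEX can be run against one of the sample lines outside Splunk. The following is a rough sketch only: it uses Python's re module rather than the PCRE engine Splunk uses, and the helper name numbered_groups is made up for illustration.

```python
import re

# REGEX from the transforms.conf stanza above, copied verbatim.
PATTERN = re.compile(
    r'\[([^\]]*)\]\s+\[([^\]]*)\]\s+\[client\s([^\]]*)\]\s+'
    r'((\[[^\]]*\]|[\s\w/\d\.]+))?(\s+\[([^\]]*)]\s*)?'
    r'("([^"]*)"\s*URI:\s*([^\s]*)\s*APACHE:\s*([^\s]*))?'
)

# One of the ZkError sample lines from the original post.
SAMPLE = (
    '[Fri Oct 11 08:50:39 2013] [error] [client 10.65.0.78] '
    '[Fri Oct 11 08:50:39 2013] [ZkError:Notice] '
    '"Undefined index: 0 in /export/sites/client_05/bin/ips/vendor/smarty/'
    'Smarty.class.php(1950) : eval()\'d code at line 75" '
    'URI: ///multimedia/-206901-1.html APACHE: 10.65.0.78.1380564815002415'
)


def numbered_groups(line):
    """Return {group_number: text} for every group that captured something.

    Hypothetical helper, not part of Splunk; it only mirrors the $1..$11
    numbering that FORMAT relies on.
    """
    m = PATTERN.search(line)
    if m is None:
        return {}
    return {i: g for i, g in enumerate(m.groups(), start=1) if g is not None}


if __name__ == "__main__":
    for number, text in numbered_groups(SAMPLE).items():
        print(f"${number} = {text!r}")
```

On this sample line, group 5 picks up the duplicated timestamp, group 6 holds the error type with its brackets and surrounding whitespace, group 7 holds the bare error type, and groups 9, 10 and 11 hold the quoted message, the URI and the APACHE value. Groups are counted by their opening parenthesis, so the numbering should carry over to Splunk, but it is worth confirming there.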


Re: Apache error log field extraction

Builder

That almost gets me there, but is missing the referrer and cookie extraction that I need. In this extraction, "message" contains the error, the referrer, and the cookie all together.

I don't know if it helps, but on the lines that have a referrer and a cookie, the entire error always appears between double quotes, and the referrer and cookie come after the closing double quote.
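
That layout (quoted error, then URI:, then APACHE:) can be expressed as a small pattern of its own. Here is a minimal sketch in Python, assuming the quoted error never contains a double quote itself; the names TAIL and split_tail are invented for illustration, and the group names simply reuse the field names from the original post.

```python
import re

# Quoted error, then "URI:" (the referrer), then "APACHE:" (the cookie).
# Assumption: the error text contains no embedded double quotes.
TAIL = re.compile(
    r'"(?P<error_message>[^"]*)"\s+'
    r'URI:\s+(?P<referrer>\S+)\s+'
    r'APACHE:\s+(?P<cookie>\S+)'
)


def split_tail(line):
    """Return the three fields as a dict, or None if the line has no such tail."""
    m = TAIL.search(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    sample = (
        '[Fri Oct 11 08:50:39 2013] [error] [client 10.65.0.78] '
        '[Fri Oct 11 08:50:39 2013] [ZkError:Notice] '
        '"Undefined index: 0 in /export/sites/client_05/bin/ips/vendor/smarty/'
        'Smarty.class.php(1950) : eval()\'d code at line 75" '
        'URI: ///multimedia/-206901-1.html APACHE: 10.65.0.78.1380564815002415'
    )
    print(split_tail(sample))
```

Lines without a quoted error, such as the "Invalid URI in request" entries, return None here, so they would need to be handled by a separate pattern.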


Re: Apache error log field extraction

SplunkTrust

Try it now, I edited it.


Re: Apache error log field extraction

Builder

Still not it. That causes the second timestamp (no, I don't know why we log the timestamp twice on some errors) to appear as the error_message and drops the rest on the floor. error_message::$7 logs the error_type.

I'm updating my original post with a THIRD type of error being logged that I just came across, so that makes it even more complicated.

While I really like the idea of doing this at index time, I wonder if it wouldn't be easier to do an inline extraction (or multiple inline extractions)?


Re: Apache error log field extraction

Builder

Okay, so this works the way I want on the two types I had previously identified, but it doesn't work on the newly-identified PHP stack trace errors:

REGEX = \[([^\]]*)\]\s+\[([^\]]*)\]\s+\[client\s([^\]]*)\]\s+((\[[^\]]*\]|[\s\w/\d\.]+))?(\s+\[([^\]]*)]\s*)?("([^"]*)"\s*URI:\s*([^\s]*)\s*APACHE:\s*([^\s]*))?
FORMAT = error_time::$1 remote_host::$3 error_type::$6 error_message::$9 referrer::$10 cookie::$11
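
The PHP stack trace lines have a different tail (", referer: ..." with no quotes, URI: or APACHE:), so one option is a second, separate extraction for them. Below is a rough Python sketch of such a pattern, not a tested Splunk configuration; the names PHP_LINE and parse_php_line are made up, and it assumes the referer is always the last thing on the line.

```python
import re

# Second pattern, only for lines of the form:
#   [time] [level] [client ip] PHP <n>. <frame>, referer: <url>
# The frame number ("17. ", "1. ") is optional and is left out of error_message.
PHP_LINE = re.compile(
    r'^\[(?P<error_time>[^\]]*)\]\s+\[[^\]]*\]\s+'
    r'\[client\s(?P<remote_host>[^\]]*)\]\s+'
    r'PHP\s+(?:\d+\.\s+)?(?P<error_message>.*?),\s+'
    r'referer:\s+(?P<referrer>\S+)$'
)


def parse_php_line(line):
    """Return the extracted fields as a dict, or None for non-PHP lines."""
    m = PHP_LINE.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    sample = (
        '[Fri Oct 11 15:23:36 2013] [error] [client 10.65.0.70] '
        'PHP 1. {main}() /export/sites/client_04/htdocs/subscribe/lead_webinar.html:0, '
        'referer: //client04.ipd-las-icarus.iproduction.com/subscribe/'
        'lead_webinar.html?product_id=44742&zkConsole=1'
    )
    print(parse_php_line(sample))
```

For the "PHP Stack trace:, referer: ..." line this sketch yields "Stack trace:" as the error_message, which differs slightly from the "PHP Stack trace" value listed in the original post, so the pattern would need adjusting if that exact value matters.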