I have a pile of Apache access logs where the format is just slightly modified from the default. Is there any way I can leverage Splunk's existing knowledge of the "apache-common" sourcetype to get more intelligent parsing of my slightly modified format?
Here's the original 'common' format definition:
LogFormat "%h %l %u %t \"%r\" %>s %b" common
...and here's the modified version:
LogFormat "%{X-Forwarded-For}i %h %D %l %u %t \"%r\" %>s %b" common
Basically we prepended the contents of the X-Forwarded-For header (a comma-and-space-separated list of IP addresses or "-") and then shifted around the other fields.
Clearly there's no way Splunk is going to automagically figure that out -- but I'm stumped on where to start with telling it about the new format.
So I am hoping there's some way in which I can look at what tells Splunk how to understand the default format, just as a starting point for building my new version.
Seems like this must be a basic newbie question -- any tips would be appreciated.
This is what I found, I hope it helps. It is untested but should be functional.
Reference document: http://httpd.apache.org/docs/1.3/logs.html
I altered the format slightly so that each field is written as a name=value pair. Personally I would shorten the field names. Here is the altered format string (note that the quotes around %r have to be escaped inside the LogFormat string):
"xforwarder=%{X-Forwarded-For}i IP=%h userid=%u time=%t request=\"%r\" responseCode=%>s responseSize=%b"
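I removed %D and %l. %D is not defined in the Apache 1.3 docs referenced above (in Apache 2.x it logs the time taken to serve the request, in microseconds, so keep it if you need that value), and %l is filler that is almost always "-".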
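If changing the LogFormat is not an option for the logs you already have, the modified format from the question can also be picked apart with a regular expression. Below is a minimal Python sketch of that idea, not a Splunk configuration. The group names (xff, host, duration_us and so on) and the sample line are made up for illustration, and it assumes %D is the Apache 2.x request time in microseconds. The pattern itself is the part that might carry over to a Splunk field extraction, where named groups are usually written as (?<name>...) rather than Python's (?P<name>...).

```python
import re

# Pattern for: %{X-Forwarded-For}i %h %D %l %u %t "%r" %>s %b
# Group names are hypothetical; pick whatever field names suit you.
LOG_PATTERN = re.compile(
    r'^(?P<xff>-|[^,\s]+(?:, [^,\s]+)*) '   # "-" or "ip, ip, ..."
    r'(?P<host>\S+) '                        # %h  remote host
    r'(?P<duration_us>\d+) '                 # %D  assumed microseconds (Apache 2.x)
    r'(?P<logname>\S+) '                     # %l  identd logname, usually "-"
    r'(?P<user>\S+) '                        # %u  authenticated user or "-"
    r'\[(?P<time>[^\]]+)\] '                 # %t  timestamp in brackets
    r'"(?P<request>(?:[^"\\]|\\.)*)" '       # %r  request line, allows \" escapes
    r'(?P<status>\d{3}) '                    # %>s final status code
    r'(?P<bytes>\d+|-)$'                     # %b  response size or "-"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    # Split the X-Forwarded-For list into individual addresses.
    xff = fields["xff"]
    fields["xff"] = [] if xff == "-" else xff.split(", ")
    return fields


if __name__ == "__main__":
    # Invented sample line, only to show the shape of the input.
    sample = ('203.0.113.7, 10.0.0.5 10.0.0.1 1532 - frank '
              '[10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326')
    print(parse_line(sample))
```

The X-Forwarded-For field is matched first and separately because it is the only field that can contain spaces outside of brackets or quotes; every other field can then be treated as a single whitespace-free token.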