LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
Here is how those fields break down:
%h
Remote hostname (i.e., who's asking?)
%l
Remote logname (if defined)
%u
Remote user (if authenticated)
%t
Time request was received
%r
First line of the request
%>s
Final status (200, 404, etc.)
%b
Size of the response in bytes
%{Referer}i
How did this user get here?
%{User-Agent}i
What browser is the user using?
I want to be able to filter on the first item (remote hostname) so I can weed out known scanners, but every time I try to do something with that field it gets deleted.
Can I translate the Apache log format into something Splunk can handle?
Thank you a million in advance.
Hi. What I did for Apache logs was to use the CustomLog directive and name each field that you want the data to be extracted into, for example:
remotehost=%h status=%>s ..
Then, when your logs get ingested, the fields named remotehost, status, etc. will already have their values. You don't have to worry about field positions or custom extractions, because Splunk extracts the keyword=value pairs itself.
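To make the idea concrete, here is a rough Python sketch of what keyword=value extraction amounts to. It is only an illustration, not how Splunk implements it; the helper name extract_kv, the bytes field, and the sample line are made up for this example.

```python
import re

# Hypothetical pattern: a word, an equals sign, then either a quoted
# string or a run of non-whitespace characters.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')

def extract_kv(line):
    """Return a dict of the keyword=value pairs found in one log line."""
    fields = {}
    for key, value in KV_PATTERN.findall(line):
        # Drop surrounding quotes so quoted and bare values look the same.
        fields[key] = value.strip('"')
    return fields

# Made-up sample line in the keyword=value style described above.
sample = 'remotehost=203.0.113.7 status=404 bytes=512'
fields = extract_kv(sample)
print(fields["remotehost"])
```

Because every value carries its own name, the order of the fields in the log line no longer matters.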
I appreciate your response. When you have a moment, could you point me to where I can learn more about using custom logs with Splunk? Meanwhile I will keep looking on Splunk.com. Thank you very much again for your time.
https://httpd.apache.org/docs/2.4/mod/mod_log_config.html#customlog
So you have to change the configs on Apache. The page linked above has an example:
# CustomLog with explicit format string
CustomLog "logs/access_log" "%h %l %u %t \"%r\" %>s %b"
Instead you would have something like:
CustomLog "logs/access_log" "%{%Y-%m-%d %H:%M:%S %z}t remotehost=%h status=%>s"
Your access log would then have the date at the start of each event. The date would include the timezone (always include that), followed by remotehost=<hostname> status=<status value>.
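As a quick sanity check of that layout, the following Python sketch parses lines of the shape described above (timestamp with UTC offset first, then keyword=value pairs) and drops events from known scanners. The sample lines, the scanner address, and the name parse_event are all assumptions made for illustration; in practice the filtering would be done in a Splunk search rather than in Python.

```python
from datetime import datetime

# Hypothetical list of scanner addresses to weed out.
KNOWN_SCANNERS = {"198.51.100.23"}

def parse_event(line):
    """Split one event into a timezone-aware timestamp and a field dict."""
    # The first three space-separated tokens are date, time and UTC offset.
    date_part, time_part, offset, rest = line.split(" ", 3)
    timestamp = datetime.strptime(
        f"{date_part} {time_part} {offset}", "%Y-%m-%d %H:%M:%S %z"
    )
    # The remainder is assumed to be simple keyword=value pairs without spaces.
    fields = dict(pair.split("=", 1) for pair in rest.split())
    return timestamp, fields

# Made-up sample events in the assumed format.
lines = [
    "2024-05-01 12:00:00 +0000 remotehost=198.51.100.23 status=404",
    "2024-05-01 12:00:05 +0000 remotehost=203.0.113.7 status=200",
]

# Keep only events whose remote host is not a known scanner.
kept = [
    fields
    for _, fields in map(parse_event, lines)
    if fields["remotehost"] not in KNOWN_SCANNERS
]
print(kept)
```

Keeping the UTC offset in the timestamp means the event time is unambiguous no matter which timezone the server runs in.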
Thank you again. Are these config changes done by the Apache team or the Splunk team? I appreciate your time.
Hi. These are changes to Apache, so they would be done by the Apache team.
You are configuring Apache to change the log formatting.
The Apache team makes the configuration changes and restarts Apache, and then the logs are written in a different, better format.