rsyslog + Apache access logs: how to parse correctly?

gargantua
Path Finder

Dear splunkers,

I need to ingest some Apache log files.

  • Those log files are first sent to a syslog server by rsyslog

  • rsyslog adds its own information to each line of the log file.

  • A UF is installed on this syslog server; it can monitor the log files and send them to the indexers

Each line of the log file looks like this:

2024-02-16T00:00:00.129824+01:00 website-webserver /var/log/apache2/website/access.log 10.0.0.1 - - [16/Feb/2024:00:00:00 +0100] "GET /" 200 10701 "-" "-" 228

 

As you can see, the first part of each line, up to and including "/access.log ", was added by rsyslog, so this is something I want Splunk to filter out / delete.

So far, I'm able to monitor the file and filter out the rsyslog layer of the events with a SEDCMD parameter. I also added a TIME_PREFIX parameter, and Splunk then detects the timestamp automatically. Like this:

SEDCMD-1=s/^.*\.log //g
TIME_PREFIX=- - \[

 

I created a custom sourcetype accordingly.

But the issue is that field extraction is not working properly. Almost no fields besides the _time-related fields are being extracted.
I guess it's because I'm using a custom sourcetype, so Splunk is not extracting the fields automatically as it should, but I'm not really sure...

I'm a bit lost 😞

Thanks a lot for your kind help 🙂


marnall
Motivator

You are correct in saying that Splunk no longer automatically extracts the fields with a new custom source type. Splunk does attempt to make field extractions if there are <key>=<value> patterns in the data, but that does not seem to be the case in these logs.

You could try using SEDCMD to rewrite the logs so they are formatted like Apache HTTP logs, then set the sourcetype to the standard Apache HTTP log sourcetype; field extraction should then work. I also recommend getting the Apache Web Server app, as it has knowledge objects for Apache HTTP logs: https://splunkbase.splunk.com/app/3186
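
As a rough sketch of that suggestion (not a tested configuration): the fragment below keeps the SEDCMD and TIME_PREFIX settings from the question but assigns the built-in access_combined sourcetype. The monitored path and the class name strip_rsyslog are hypothetical. The tighter SEDCMD pattern and the TIME_FORMAT value are assumptions derived from the sample line; the pattern is meant to avoid the greedy .* stripping too much if a later field ever contains ".log ". Whether the built-in extractions match this exact layout (a request without a protocol version, plus a trailing 228) would need checking against real events.

```
# inputs.conf on the universal forwarder (the syslog server)
# The monitored path is hypothetical; use the file rsyslog actually writes.
[monitor:///var/log/remote/website-webserver/access.log]
sourcetype = access_combined

# props.conf on the indexers (SEDCMD is applied at parse time, not on the UF)
[source::/var/log/remote/website-webserver/access.log]
# Strip the "<timestamp> <host> <path ending in .log> " prefix added by rsyslog.
SEDCMD-strip_rsyslog = s/^\S+ \S+ \S+\.log //
# Same timestamp anchor as in the question.
TIME_PREFIX = - - \[
# Assumed from the sample line: 16/Feb/2024:00:00:00 +0100
TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z
```

Scoping the parse-time settings to a source:: stanza rather than to [access_combined] itself keeps them from affecting other data that already uses that sourcetype.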


gargantua
Path Finder

"You are correct in saying that Splunk no longer automatically extracts the fields with a new custom source type"

Do you know if there is a way to activate this option?


marnall
Motivator

Yes. If you get an app containing the field extractions, like the Apache Web Server app, and set the sourcetype to one of those the field extractions apply to (usually listed in the app documentation or config files), then you will have field extractions.
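
A possible variant, if the custom sourcetype has to be kept: a search-time REPORT setting can point that sourcetype at an existing transform. The sketch below assumes the access-extractions transform that Splunk Enterprise uses for access_combined is available on the search head, and my_apache_rsyslog is a hypothetical sourcetype name. It is untested against this exact log layout.

```
# props.conf on the search head (search-time field extraction)
# "my_apache_rsyslog" is a hypothetical name for the custom sourcetype.
[my_apache_rsyslog]
# Reuse the transform the built-in access_combined sourcetype relies on
# (assumed to be present in the default transforms.conf).
REPORT-access = access-extractions
```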

 

The other way is to put the logs in JSON format, but that may not work well with these logs.
