Getting Data In

rsyslog + Apache access logs: how to parse correctly?

gargantua
Path Finder

Dear splunkers,

I need to ingest some Apache log files.

  • Those log files are first sent to a syslog server by rsyslog

  • rsyslog adds its own information to each line of the log file.

  • A UF is installed on this syslog server; it can monitor the log files and send them to the indexers

Each line of the log file looks like this :

 

2024-02-16T00:00:00.129824+01:00 website-webserver /var/log/apache2/website/access.log 10.0.0.1 - - [16/Feb/2024:00:00:00 +0100] "GET /" 200 10701 "-" "-" 228

 

As you can see, the first part of each line, up to and including "/access.log ", has been added by rsyslog, so this is the part I want Splunk to filter out / delete.

So far, I'm able to monitor the file and strip the rsyslog layer from the events with a SEDCMD parameter. I also added a TIME_PREFIX parameter, so Splunk automatically detects the timestamp. Like this:

 

SEDCMD-1=s/^.*\.log //g
TIME_PREFIX=- - \[
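
To check what these two settings do to the sample event, here is a rough Python emulation. It is only a sketch: Python's re module stands in for Splunk's regex engine, the TRICKY event is a made-up line that is not from my logs, and strip_three_tokens is an assumed alternative pattern (it presumes SEDCMD accepts \S the way PCRE does), not something taken from the Splunk docs.

```python
import re
from datetime import datetime

# Sample event copied verbatim from the post.
SAMPLE = (
    '2024-02-16T00:00:00.129824+01:00 website-webserver '
    '/var/log/apache2/website/access.log 10.0.0.1 - - '
    '[16/Feb/2024:00:00:00 +0100] "GET /" 200 10701 "-" "-" 228'
)

# Hypothetical event (NOT from the post) whose request path ends in ".log".
TRICKY = (
    '2024-02-16T00:00:01.000000+01:00 website-webserver '
    '/var/log/apache2/website/access.log 10.0.0.1 - - '
    '[16/Feb/2024:00:00:01 +0100] "GET /debug.log HTTP/1.1" 200 12 "-" "-" 5'
)


def strip_greedy(line):
    # Emulates SEDCMD-1: the pattern is greedy, so it cuts up to the LAST ".log ".
    return re.sub(r'^.*\.log ', '', line)


def strip_three_tokens(line):
    # Assumed alternative: drop exactly the first three space-separated tokens
    # (rsyslog timestamp, host, file path) and nothing else.
    return re.sub(r'^\S+ \S+ \S+ ', '', line)


def parse_time_after_prefix(event):
    # Emulates TIME_PREFIX: timestamp parsing starts right after "- - [".
    m = re.search(r'- - \[', event)
    return datetime.strptime(event[m.end():m.end() + 26], '%d/%b/%Y:%H:%M:%S %z')


cleaned = strip_greedy(SAMPLE)
print(cleaned)
print(parse_time_after_prefix(cleaned).isoformat())
print(strip_greedy(TRICKY))        # the greedy pattern also eats part of the request
print(strip_three_tokens(TRICKY))  # the anchored pattern keeps the Apache part whole
```

On the sample event both patterns give the same result. The difference only shows up when a requested URL itself ends in ".log", in which case the greedy pattern removes part of the Apache line as well.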

 

I created a custom sourcetype accordingly.

But the issue is that the field extraction is not working properly. Almost no fields besides the _time related fields are being extracted.
I guess it's because I'm using a custom sourcetype, so Splunk is not extracting the fields automatically as it should, but I'm not really sure...

I'm a bit lost 😞

Thanks a lot for your kind help 🙂


marnall
Motivator

You are correct in saying that Splunk no longer automatically extracts the fields with a new custom source type. Splunk does attempt to make field extractions if there are <key>=<value> patterns in the data, but that does not seem to be the case in these logs.
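
To illustrate the <key>=<value> point, here is a small Python sketch. The pattern in KV_RE is my own simplification of what automatic key/value discovery looks for, not Splunk's actual implementation, and the second test string is a made-up example.

```python
import re

# Simplified stand-in for key=value discovery (an assumption, not Splunk's real logic).
KV_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def find_kv_pairs(event):
    # Returns a list of (key, value) tuples found in the event text.
    return KV_RE.findall(event)


# Event from the question, after the rsyslog prefix has been stripped.
apache_event = '10.0.0.1 - - [16/Feb/2024:00:00:00 +0100] "GET /" 200 10701 "-" "-" 228'

# Hypothetical event in key=value style, for contrast.
kv_event = 'status=200 bytes=10701'

print(find_kv_pairs(apache_event))
print(find_kv_pairs(kv_event))
```

The Apache line contains no "=" at all, so there is nothing for this kind of discovery to pick up. The fields are purely positional and need an explicit extraction.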

You could try using SEDCMD to reshape the events so they are formatted like standard Apache HTTP logs, and then set the sourcetype to the standard Apache access log sourcetype (such as access_combined). The fields should then be extracted. I also recommend getting the Apache Web Server app, as it has knowledge objects for Apache HTTP logs. https://splunkbase.splunk.com/app/3186
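
For a feel of what such an extraction has to match, here is a Python sketch run against the cleaned sample event. The regex and the field names are my own assumptions, loosely modelled on common Apache field names. They are not the actual extraction shipped with Splunk or the app. The trailing number (228) is not explained in the question, so the sketch just captures it under the hypothetical name "extra".

```python
import re

# Assumed pattern for the cleaned event; the group names are illustrative only.
ACCESS_RE = re.compile(
    r'^(?P<clientip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<req_time>[^\]]+)\] '
    r'"(?P<method>[^\s"]+) (?P<uri>[^\s"]+)(?: (?P<version>[^"]+))?" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<useragent>[^"]*)"'
    r'(?: (?P<extra>\S+))?$'
)


def extract_fields(event):
    # Returns a dict of field name -> value, or None if the event does not match.
    m = ACCESS_RE.match(event)
    return m.groupdict() if m else None


# Sample event from the question with the rsyslog prefix already removed.
event = '10.0.0.1 - - [16/Feb/2024:00:00:00 +0100] "GET /" 200 10701 "-" "-" 228'

fields = extract_fields(event)
for name, value in fields.items():
    print(f'{name} = {value}')
```

Note that the sample request is "GET /" with no protocol version, and there is an extra value after the user agent, so the line is not exactly the usual combined format. If the stock extractions do not match everything, a search-time regex along these lines, adapted to Splunk's regex syntax, may be a fallback worth testing.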


gargantua
Path Finder

> You are correct in saying that Splunk no longer automatically extracts the fields with a new custom source type

Do you know if there is a way to activate this option?


marnall
Motivator

Yes. If you install an app containing the field extractions, like the Apache Web Server app, and then set the sourcetype to one of those the extractions apply to (usually listed in the app documentation or config files), you will get the field extractions.

 

The other way is to put the logs in JSON format, but that may not work so well with these logs.
