Getting Data In

host name extraction / regex from syslog using Rsyslog or Splunk

mlody11
Engager

Hey all, I just wanted to get people's opinions on the best method for getting firewall data into Splunk. Our firewall logs arrive via syslog; we are using rsyslog and it's working fine. The firewall data comes into a central collection point, which then forwards it to our heavy forwarders.

So the data path looks like this (firewall) > (firewall log collection node) > (load balancer) > (HF) > Indexer

The catch is that the host recorded in Splunk is the firewall log collection node instead of the firewall itself. To get the host name of the firewall, we can extract it from the message. The question is, where is it better to do that extraction?

The messages look like this:

 

blah blah blah originsicname=CN\=THIS_IS_THE_HOSTNAME,O\=somethingelse sequencenum=3291 some more blah blah blah

 

 

Option 1: Let rsyslog do it.

As the messages come in, a regex routine in rsyslog extracts the firewall host name from each message and writes the message to a file whose folder path contains that host name. The templates and rsyslog script are below.

 

template(name="checkpoint_host_extrated-dynaFile" type="string" string="/var/log/syslog/%$MYHOSTNAME%/checkpoint_firewall_514/%!extracted_firewall_hostname%/%$YEAR%-%$MONTH%-%$DAY%-%$HOUR%.log")
template(name="firewall_host_extraction_originsicname" type="string" string="%msg:R,ERE,1,FIELD:originsicname=...=(.+),O--end%")

 

 

if $rawmsg contains ["originsicname=CN"] then { 
  reset $!extracted_firewall_hostname = exec_template("firewall_host_extraction_originsicname");
  action(name="checkpoint_firewall_514-write" type="omfile" DynaFile="checkpoint_host_extrated-dynaFile" template="rawmsg_format" dynaFileCacheSize="5" closeTimeout="5" ioBufferSize="64k" fileOwner="splunk" dirOwner="splunk" dirGroup="splunk" fileGroup="splunk" fileCreateMode="0755" dirCreateMode="0755")}

 

This works great, host name is recorded properly, etc. I may still need to do some error correction in case it doesn't get a match. Some documentation regarding this:

https://www.rsyslog.com/doc/master/configuration/nomatch.html

https://www.rsyslog.com/regex/

https://www.rsyslog.com/doc/v8-stable/configuration/property_replacer.html

https://www.rsyslog.com/how-to-use-set-variable-and-exec_template/
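For anyone who wants to test the pattern outside rsyslog, here is a minimal Python sketch of the same extraction, including a fallback for messages that do not match. It assumes Python's `re` behaves like the POSIX ERE for this input (the two can differ in edge cases). The function name and the `FALLBACK_HOST` value are hypothetical and not part of the rsyslog config above. As far as I can tell from the nomatch page linked above, the `FIELD` mode used in the template returns the whole field when nothing matches, which is why some explicit fallback may be worth having.

```python
import re

# Same pattern as in the rsyslog template: three arbitrary characters
# (here "CN\") after "originsicname=", then "=", then the captured host.
PATTERN = re.compile(r"originsicname=...=(.+),O")

# Hypothetical placeholder for messages without a match; not from the thread.
FALLBACK_HOST = "unknown_firewall"


def extract_firewall_host(msg: str) -> str:
    """Return the captured host name, or FALLBACK_HOST when nothing matches."""
    match = PATTERN.search(msg)
    return match.group(1) if match else FALLBACK_HOST


sample = (
    r"blah blah blah originsicname=CN\=THIS_IS_THE_HOSTNAME,O\=somethingelse "
    "sequencenum=3291 some more blah blah blah"
)

print(extract_firewall_host(sample))                    # THIS_IS_THE_HOSTNAME
print(extract_firewall_host("blah blah, no origin"))    # unknown_firewall
```

Note that `(.+)` is greedy, so a message with a second `,O` later in the line would capture more than the host name; a tighter group such as `([^,]+)` is one possible alternative, assuming host names never contain a comma.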

 

Option 2: have Splunk do the extraction when the heavy forwarder reads the log file.

I haven't written this but if this is more efficient, I could put the effort into it.


Thoughts? (FYI, I added all that info above for reference in case anyone else needs to do a regex extraction of a field via rsyslog / syslog).


venkatasri
SplunkTrust

@mlody11 

Option 1 is more efficient: it puts less overhead on the parsing and other queues. And, as you mentioned, if some of your _raw events don't contain the firewall hostname, Option 2 is not a solution for your case.

---

An upvote would be appreciated if it helps!

venkatasri
SplunkTrust

Hi @mlody11 

You can go with Option 1:

If you are able to extract the source firewall hostname and write it into the absolute path of the log file, and you are running a Splunk UF on the host where the *.log files are being written, then use the host_segment setting in inputs.conf to override the default host field.
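To make the segment counting concrete, here is a rough Python sketch of how host_segment picks the host out of a file path, as I understand the setting (segments are counted from 1, split on "/"). This is an illustration, not Splunk code; the function name, collector name and firewall name in the path are made up.

```python
def host_from_segment(path: str, host_segment: int) -> str:
    """Return the Nth "/"-separated path segment, counting from 1."""
    segments = [part for part in path.split("/") if part]
    return segments[host_segment - 1]


# Hypothetical file written by the dynaFile template from the question:
# /var/log/syslog/<collector>/checkpoint_firewall_514/<firewall>/<date>.log
path = "/var/log/syslog/collector01/checkpoint_firewall_514/fw01/2024-01-01-00.log"

print(host_from_segment(path, 6))  # fw01
```

With that path layout the extracted firewall name is the sixth segment, so host_segment = 6 would be the value to try in the monitor stanza, assuming the directory structure stays exactly as in the template.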


mlody11
Engager

Yup, that's exactly what I'm doing, overwriting the host name from the log path that is created by extracting it from the message using rsyslog.

The question really is, which option is more efficient?

Also, I guess I should mention that some of the logs have it and some of them don't, so extracting only from the ones that have it is also something to consider.
