I have a small indexer cluster, single search head, and syslog-ng (all individual systems).
I'm working through the requirements for the Palo Alto Networks app and add-on. The guide says to use a heavy forwarder but doesn't say why. Why use a heavy forwarder? Also, if the syslog-ng box has a heavy forwarder installed and is indexing as well as forwarding, how much data can I expect to be indexed locally? All of it? Or is it configurable, for example, one day's worth of data?
What should the inputs.conf file look like on the HF? There isn't an example in the setup guide. This is how I have mine set using index=main for testing. It seems to be working.
[monitor:///var/log/syslog/pan/]
index = main
sourcetype = pan:log
no_appending_timestamp = true
A couple more items that others may find useful.
For syslog-ng: know which user the process runs as (probably root), since this sets the default ownership and permissions of the directories it creates. Make sure that either your HF user 'splunk' can read the log locations/files or that you set permissions in the pan.conf file. This is how I set the perms in pan.conf.
options{
create_dirs(yes);
dir_owner("splunk");
dir_group("splunk");
dir_perm(0700);
owner("splunk");
group("splunk");
perm(0600);
};
Also, even though your syslog server isn't indexing data, the syslog log files will still take up space, and with a PA firewall they grow quickly. I set up a script that runs periodically to remove log files older than a couple of days.
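A cleanup job like that could look roughly like the Python sketch below. It is an illustration only, not the original poster's script: the function name remove_old_logs, the two-day cutoff, and the optional now parameter (there to make testing easier) are all assumptions. It deletes regular files by modification time and leaves the directories in place.

```python
import time
from pathlib import Path


def remove_old_logs(root, max_age_days=2, now=None):
    """Delete regular files under `root` last modified more than
    `max_age_days` days ago. Returns the sorted list of deleted paths.

    Hypothetical helper for illustration; adjust the path and age to taste.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day

    removed = []
    # Snapshot the tree first so we never delete while the walk is in progress.
    for path in list(Path(root).rglob("*")):
        # Skip directories and symlinks; only plain log files are removed.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return sorted(removed)
```

Run from cron as a user with write access to the log directory, a call such as remove_old_logs("/var/log/syslog/pan", max_age_days=2) would match the monitor path used in the inputs.conf above. Keep the cutoff comfortably longer than any outage you expect on the forwarder, so files are not deleted before Splunk has read them.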
For the PA app dashboards, make sure to enable Data Model acceleration as described by PA at this link:
https://splunk.paloaltonetworks.com/installation.html
That is bad advice. Send to syslog-ng, then use a Universal Forwarder (UF), NOT a Heavy Forwarder. See here:
https://www.splunk.com/blog/2016/12/12/universal-or-heavy-that-is-the-question.html
From that link, Greg:
Recommendations
Only use the Heavy Forwarder when:
-Dropping a significant proportion of the data at source.
-Complex UI or addon requirements, e.g. DBconnect, Checkpoint, Cisco IPS.
-Complex (per-event) routing of the data to separate indexers or indexer clusters.
The Palo Alto TA falls into the second category.
As I have posted above, ordinarily I would agree with the UF approach, but like DB Connect, Palo Alto is a special case.
Every event in the TA is assessed for sourcetype override, this is true. But IMHO, this is best done on the Indexers, not on the HF.
But if your Indexers are already screaming in pain, then maybe go HF. But in such a case, I would get more Indexers and go UF.
I'm running a single search head, indexer cluster, and syslog-ng. What is the recommendation?
1) PA -> syslog-ng+HF -> idx_cluster
2) PA -> syslog-ng+UF -> idx_cluster
Mike
I know, it's quite a pickle, but the thing is the TA also makes API calls to fetch results.
Even if you installed the TA on the indexers, you're still going to have to run a separate HF for the WildFire and Aperture integrations. (You could probably get away with a stand-alone indexer with the TA, but not a clustered one.)
I think this is one of those times where there are pros and cons for both options. I know this TA took more config than most others to get working properly in our env (albeit an inherited mess).
Installing it per the Splunk spec simplified the process for us.
Oh, if you are not using syslog and are making API calls (again, I would go syslog-ng export), then yes, you MUST use HF. Case closed.
That explanation applies when the HF itself is used as the syslog server.
If you already have a syslog server, just install a UF on the syslog server and forward the logs to the indexers. This is a common setup.
Hi @mikesangray
It's generally a good idea to use Heavy Forwarders only when they are running scripts/logic for retrieving data from remote APIs (like the AWS, ServiceNow, and Office 365 add-ons). However, the bulk of the data collected by the PA apps comes in via syslog.
(There are some API requests to trigger the WildFire integration, but that's pretty lightweight.)
I would agree with you: I would use the existing syslog-ng box as your HF.
Where this could be complicated however, is if you have installed a Universal Forwarder on the syslog host.
The Universal Forwarder is a very lightweight Splunk implementation and cannot run TAs like this one, which need the Python framework of a full Splunk install. So if you have a UF on that host, you would want to remove it and install an HF over the top.
From a cursory view of the file system there is little difference, but the HF has the UI and Python framework to configure/monitor the forwarding aspects.
Note that an HF is just a Splunk Enterprise install; it's not a separate package.
As its name suggests, a heavy forwarder does not do any 'indexing'; it simply forwards data to your indexers.
(An HF can do some event pre-processing/transforming, but unless you explicitly enable index-and-forward in outputs.conf, indexing happens only on the indexers.)
The setup instructions specifically call for a HF, but don't explain why.
"Install a heavy forwarder on each syslog-ng server"
https://docs.splunk.com/Documentation/Splunk/7.2.4/AddPANIXC/InstallUFsyslogserver
(I am the same person as OP).
You can't install the TA on a UF - It has to be an HF.
There is a danger of conflating two issues here...
The Splunk_TA_paloalto is a TA which makes (if configured) outbound connections to various APIs provided by Palo Alto. It also performs extractions and provides lookups, workflows, and other configurations.
For this reason, you have to install this on a full Splunk installation, which in this case should be a Heavy Forwarder.
Also, because it performs ingestion transformations on the data, the advice is to receive the syslog data directly on the HF and then send it to an indexer.
You COULD send syslog events to your syslog-ng server, and use a Universal Forwarder on the syslog-ng host and THEN send the data to an HF with the "Splunk_TA_paloalto" TA installed on it.
But that just adds complexity, requires another host, and adds no real value.
With all of the above said, and as @HiroshiSatoh comments below:
A commonly recommended approach is to use a Universal Forwarder on a syslog server. This would ordinarily be my recommendation too; however, as a real-world user of PA, I would consider this a special case.
PAs generate a LOT of syslog traffic, and there is a lot to be said for running a dedicated syslog server just for firewalls.
In this case I would suggest following the guidance from the documentation: use a Heavy Forwarder on the syslog server and install the TA on the same system.
Good explanation, thank you! I will follow the instructions and use an HF on syslog-ng.
Does the HF index the logs and store locally on the syslog box? If so, how long do you keep the data on the syslog box?
I'm following the setup guide and it refers to 'index and forward' so that's what I'm confused about.
"Use the heavy forwarder to index your data locally and to forward the data to another index."
https://docs.splunk.com/Documentation/Splunk/7.2.4/AddPANIXC/InstallUFsyslogserver
I'll get it set up and see how it goes.
Thanks.
I’ll raise that with the docs team, because that sentence is confusing.