The Splunk for Palo Alto Networks App (https://apps.splunk.com/app/491/) wants the PAN logs sourcetyped as "pan_log", but this conflicts with ES, which requires just "pan". Digging into the ES TA-paloalto, I see a reference to pan_log with the comment "Added per SOLNESS-2728 for compatibility with SplunkforPaloAltoNetworks app". One could modify the apps, but it's your warranty. 🙂
Now as far as your questions on sourcetyping:
What is the best way to work around this?
Since you can't change data in Splunk once it's written to disk (indexed), your previous data is stuck with that sourcetype. One caveat: if you still have the original files, you can re-index them if you kick the fishbucket for each file (see btprobe: http://docs.splunk.com/Documentation/Splunk/6.2.0/Troubleshooting/CommandlinetoolsforusewithSupport#btprobe).
Do I need to manually override the sourcetype name (if so, I lose access to previous data?)?
You will not lose the old data; it will still have the sourcetype it was originally written with. I make it a bit easy on myself: since I have several other syslog data sources, I use rsyslog with a fairly static configuration (see below).
(or) Will a simple "rename" fix it?
I'm not sure I know the exact answer to this. In my experience, the ES app includes the necessary props and transforms, and they require the data to be sourcetyped as "pan" or "pan_log" so that the pan:traffic and pan:threat sub-sourcetyping can be performed. This is done at the input phase.
Also, where should this be done (at the indexer, or in ES)?
Preferably, the sourcetype is applied at input time (usually on a UF, or wherever you're specifying the inputs.conf).
My setup:
I am using rsyslog [1] to collect my PAN firewall logs for ES and have them written to a location such as the following:
/data/splunk/logs/syslog/pan/acme-fw-01/2014-12-17_21.log
Config:
#rsyslog config snippet: /etc/rsyslog.d/splunk.conf
# Template
$template PAN,"/data/splunk/logs/syslog/pan/%fromhost%/%$year%-%$month%-%$day%_%$HOUR%.log"
# Palo Alto Networks Firewall
if $fromhost-ip == '10.0.4.1' or $fromhost-ip == '10.0.5.1' then -?PAN
& ~
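To make the template above concrete, here is a small Python sketch of how the PAN template expands into a file path: one directory per sending host and one file per hour. The function name pan_log_path is hypothetical and this only imitates the template for illustration; it is not how rsyslog itself is implemented.

```python
from datetime import datetime


def pan_log_path(fromhost: str, received: datetime) -> str:
    """Imitate the PAN rsyslog template: <base>/<fromhost>/<YYYY-MM-DD_HH>.log"""
    base = "/data/splunk/logs/syslog/pan"
    # %Y-%m-%d_%H mirrors %$year%-%$month%-%$day%_%$HOUR% in the template
    return f"{base}/{fromhost}/{received:%Y-%m-%d_%H}.log"


# A message from acme-fw-01 received at 21:05 on 2014-12-17
print(pan_log_path("acme-fw-01", datetime(2014, 12, 17, 21, 5)))
# prints /data/splunk/logs/syslog/pan/acme-fw-01/2014-12-17_21.log
```

Because the hour is part of the file name, each firewall starts a new file every hour, which keeps the individual files small for the monitor input.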
Then I configure an input such as the following to force the sourcetype as "pan":
#From inputs.conf
[monitor:///data/splunk/logs/syslog/pan/]
sourcetype=pan
index=firewalls
host_segment=6
whitelist = \.log$
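As a sanity check on host_segment=6 and the whitelist, the following Python sketch shows which part of the example path ends up as the host and whether the file name passes the whitelist. The helper names are hypothetical, and this only approximates the behavior of the two settings; it is not Splunk's own code.

```python
import re


def host_from_segment(path: str, host_segment: int) -> str:
    """Return the Nth (1-based) path segment, as host_segment does."""
    segments = [s for s in path.split("/") if s]
    return segments[host_segment - 1]


def passes_whitelist(path: str, pattern: str = r"\.log$") -> bool:
    """Check the path against the whitelist regex from the inputs.conf stanza."""
    return re.search(pattern, path) is not None


path = "/data/splunk/logs/syslog/pan/acme-fw-01/2014-12-17_21.log"
print(host_from_segment(path, 6))  # prints acme-fw-01
print(passes_whitelist(path))      # prints True
```

If you change the base directory in the rsyslog template, remember to recount the segments, otherwise the host field will pick up the wrong directory name.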
Hope this helps.