Getting Data In

Why is Splunk Universal Forwarder unable to interpret checkpoint files for ausearch when running it as a Scripted Input?

miketbrand0
Explorer
  • I’m running Splunk in a Linux Red Hat environment and trying to collect logs generated by the auditd service.  I could simply put a monitor on "/var/log/audit/audit.log", but the lines of that file aren’t organized such that the records from a specific event are together, so I would end up having to correlate them on the back-end.  I’d rather use ausearch in a scripted input to correlate the records on the front end and provide clear delimiters (----) separating events before reporting them to the central server.
  • Obviously, I don’t want the entirety of all logged events being reported each time the script is run.  I just want the script to report any new event since the last run.  I found that the "--checkpoint" parameter ought to be useful for that purpose.
  • Here’s the script that I’m using:
#!/bin/bash

# Resolve paths relative to this script's location (bin/../metadata/)
path_to_checkpoint=$( realpath "$( dirname "$0" )/../metadata/checkpoint" )
path_to_temp_file=$( realpath "$( dirname "$0" )/../metadata/temp" )

# Correlate audit records into whole events; --checkpoint makes the next run
# resume after the last event reported here
/usr/sbin/ausearch --input-logs --checkpoint "$path_to_checkpoint" > "$path_to_temp_file" 2>&1

output_code="$?"
chmod 777 "$path_to_checkpoint"

# Emit the events to stdout (i.e., to Splunk) only if ausearch succeeded
if [ "$output_code" -eq "0" ]; then
        cat "$path_to_temp_file"
fi

# Leave a run marker and the exit code in the temp file for debugging
echo "" >> "$path_to_temp_file"
date >> "$path_to_temp_file"
echo "" >> "$path_to_temp_file"
echo "$output_code" >> "$path_to_temp_file"
  • It works just fine in the first round, when the checkpoint doesn’t exist yet and is generated for the first time, but in the second and all subsequent rounds, I get error code 10: invalid checkpoint data found in checkpoint file.
  • It works fine in all rounds when I run the bash script manually from the command line, so there isn’t any kind of syntax error, and I’m not using the parameters incorrectly.
    • Based on the fact that the first round runs without error, I know that there isn’t any kind of permissions issue with running “ausearch”.
  • It works fine in all rounds when I run the bash script as a cronjob using crontab, so the fact that Scripted Inputs run like a scheduled service isn’t the root of the problem either.
  • I’ve confirmed that the misbehavior is occurring in the interpretation of the checkpoint (rather than the generation of the checkpoint) by doing the following.
    • Trial 1:
      • First round: bash script executed manually in CMD to generate the first checkpoint
      • Second round: bash script executed manually in CMD, interpreting the old checkpoint and generating a new checkpoint
      • Result: Code 0, no errors
    • Trial 2:
      • First round: bash script executed manually in CMD to generate the first checkpoint
      • Second round: bash script executed by Splunk Forwarder as a Scripted Input, interpreting the old checkpoint and generating a new checkpoint
      • Result: Code 10, "invalid checkpoint data found in checkpoint file"
    • Trial 3:
      • First round: bash script executed by Splunk Forwarder as a Scripted Input to generate the first checkpoint
      • Second round: bash script executed manually in CMD, interpreting the old checkpoint and generating a new checkpoint
      • Result: Code 0, no errors
    • Trial 4:
      • First round: bash script executed by Splunk Forwarder as a Scripted Input to generate the first checkpoint
      • Second round: bash script executed by Splunk Forwarder as a Scripted Input, interpreting the old checkpoint and generating a new checkpoint
      • Result: Code 10, "invalid checkpoint data found in checkpoint file"
    • Inference: The error occurs only when the Splunk Forwarder Scripted Input is interpreting the checkpoint, regardless of how the checkpoint was generated; therefore the interpretation is where the misbehavior takes place.
  • I’m aware that I can include the "--start checkpoint" parameter to avoid this error by making "ausearch" start from the timestamp in the checkpoint file rather than look for a specific record to start from.  I’d like to avoid that option, though, because it causes the script to send duplicate records: any records that occurred at the timestamp recorded in the checkpoint are reported both when that checkpoint was generated and again in the following execution of "ausearch".  If no events are logged by auditd between executions of "ausearch", the same events may be reported several times until a new event does get logged.
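For reference, the variant I'm avoiding would look like this (the "checkpoint" keyword for "--start" is documented in the ausearch man page):

# Resume from the timestamp stored in the checkpoint rather than from the exact
# record -- avoids error 10, but re-reports any events sharing that timestamp
/usr/sbin/ausearch --input-logs --start checkpoint --checkpoint "$path_to_checkpoint"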
  • I tried adding the "-i" parameter to the command hoping that it would help interpret the checkpoint file, but it didn't make any difference.
  • For reference, here's the format of the checkpoint file that is generated:
dev=0xFD00
inode=1533366
output=<hostname> 1754410692.161:65665 0x46B
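My reading of those fields, based on the ausearch man page, is roughly as follows (the annotations are my interpretation, not authoritative):

dev=0xFD00                                     # device holding the log file last read
inode=1533366                                  # inode of that log file
output=<hostname> 1754410692.161:65665 0x46B   # node, then time:serial of the last
                                               # event emitted (I haven't identified
                                               # the trailing hex value)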
  • I'm starting to wonder if it might be a line-termination issue: perhaps the Splunk Universal Forwarder expects each line to end in CRLF, as it would on Windows, but is seeing lines ending in LF because this is Linux.  I can't imagine why that would be the case, since the version of the Splunk Universal Forwarder I have installed is built for Linux, but it's the only thing that comes to mind.  (A quick byte-level check is sketched just after this list.)
  • I'm using version 9.4.1 of the Splunk Universal Forwarder.  The Forwarder is acting as a deployment-client that installs and runs apps issued to it by a separate deployment-server that runs Splunk Enterprise version 9.1.8.
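As a quick check of the line-termination theory mentioned above, a byte-level dump of the checkpoint would show any stray carriage returns ("\r" would appear before each "\n" if the file were CRLF-terminated); the path below assumes the checkpoint lives in the app's metadata directory, as in my script:

od -c /opt/splunkforwarder/etc/apps/<app_name>/metadata/checkpoint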

Any thoughts on what it is about Splunk Universal Forwarder Scripted Inputs that might be preventing ausearch from interpreting its own checkpoint files?


miketbrand0
Explorer

I figured out the issue.  It was a permissions issue: I needed to put splunkfwd on the appropriate access control lists.  I gave splunkfwd read access to /var/log/audit/audit.log and execute access to /var/log/audit.  Now splunkfwd can execute the script either manually from the command line or as a scheduled scripted input run by Splunk UF.  In both cases, the script runs without error whether or not a pre-existing checkpoint is in place.
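For anyone hitting the same wall, the grants I'm describing would look roughly like this using POSIX ACLs (setfacl is one way to do it; note that auditd may recreate audit.log on rotation, so the file-level entry might need to be reapplied):

setfacl -m u:splunkfwd:r /var/log/audit/audit.log   # read the log itself
setfacl -m u:splunkfwd:x /var/log/audit             # traverse the directory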

I understand that Splunk UF has the CAP_DAC_READ_SEARCH capability which allows it to read files it normally wouldn't have access to.  What I don't understand is why that capability worked fine when I asked it to generate the initial checkpoint, but then suddenly stopped working the moment that I asked it to use a pre-existing checkpoint. 

Is it possible that the CAP_DAC_READ_SEARCH capability doesn't extend to reading the inode properties of each file?  If that were the case, it would explain why the initial ausearch run went fine (the inode doesn't matter when ausearch is simply ingesting all of the audit.log files), but why ausearch then can't find the specific audit.log file matching the inode listed in the checkpoint file.
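One way to test that theory would be to compare capability sets directly; this sketch assumes the UF's main process is named splunkd and that you can read its /proc entry:

# Effective capabilities of the running UF process
grep CapEff /proc/$(pgrep -x splunkd | head -1)/status
capsh --decode=<hex value printed above>

# Effective capabilities after a plain sudo to splunkfwd (expect an empty set)
sudo -u splunkfwd grep CapEff /proc/self/status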

 

Thank you to @PickleRick and @isoutamo for your suggestions and assistance.  I couldn't have done it without you both.


PickleRick
SplunkTrust

That's an interesting case, because generally the UF should have nothing to do with how ausearch operates.  It just spawns a child process and runs the script; whether ausearch does something successfully or not is really its own responsibility.

In case there is a difference, here's what I would try: dump the environment variables and compare the environment you get when your script is spawned as an input with the one you get when the script is run by hand.
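A throwaway line in the script is enough to capture that (the dump path here is arbitrary):

printenv | sort > /tmp/uf_env.txt   # run once as a scripted input, then diff against a manual run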

miketbrand0
Explorer

I'm not sure how environment variables would factor in, considering that none of them are used in my script and all file paths are fully spelled out, but here goes:

Environment variables when running the script manually in CMD:

LS_COLORS=rs=0:di=38;5;33:ln=38;5;51: ... (I hope I don't have to elaborate all of this)
LANG=en_US.UTF-8
SUDO_GID=1000
HOSTNAME=<deployment_client_name>
SUDO_COMMAND=/bin/bash /opt/splunkforwarder/etc/apps/<app_name>/bin/audit_log_retreiver.sh
USER=root
PWD=/home/miketbrand0
HOME=/root
SUDO_USER=miketbrand0
SUDO_UID=1000
MAIL=/var/spool/mail/miketbrand0
SHELL=/bin/bash
TERM=xterm-256color
SHLVL=1
LOGNAME=root
PATH=/sbin:/bin:/usr/sbin:/usr/bin
HISTSIZE=1000
_=/bin/printenv

 

Environment variables when running the script using the Splunk Universal Forwarder:

LD_LIBRARY_PATH=/opt/splunkforwarder/lib
LANG=en_US.UTF-8
TZ=:/etc/localtime
OPENSSL_CONF=/opt/splunkforwarder/openssl/openssl.cnf
HOSTNAME=<deployment_client_name>
INVOCATION_ID=bdfc92da21b4sdb0a759a5997d9a85
USER=splunkfwd
SPLUNK_HOME=/opt/splunkforwarder
PYTHONHTTPSVERIFY=0
PWD=/
HOME=/opt/splunkforwarder
PYTHONUTF8=1
JOURNAL_STREAM=9:4867979
SSL_CERT_FILE=/opt/splunkforwarder/openssl/cert.pem
SPLUNK_OS_USER=splunkfwd
SPLUNK_ETC=/opt/splunkforwarder/etc
LDAPCONF=/opt/splunkforwarder/etc/openldap/ldap.conf
SHELL=/bin/bash
SPLUNK_SERVER_NAME=SplunkForwarder
OPENSSL_FIPS=1
SPLUNK_DB=/opt/splunkforwarder/var/lib/splunk
ENABLE_CPUSHARES=true
SHLVL=2
LOGNAME=splunkfwd
PATH=/opt/splunkforwarder/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
_=/usr/bin/printenv

 

Environment variables when running as a cron job:

LANG=en_US.UTF-8
XDG_SESSION_ID=4157
USER=root
PWD=/root
HOME=/root
SHELL=/bin/sh
SHLVL=2
LOGNAME=root
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/0/bus
XDG_RUNTIME_DIR=/run/user/0
PATH=/usr/bin:/bin
_=/usr/bin/printenv

 

Nothing stands out to me as something that the manual CMD run and the cron job have in common but Splunk UF does differently, which would impact the functionality of ausearch in the script.  Do you see anything that I might be missing?


PickleRick
SplunkTrust

While you are not directly using env vars, they can still influence the behaviour of spawned processes.  In your case the main potentially important difference is LD_LIBRARY_PATH.  Set this variable in your interactive shell session to the forwarder's value and try running ausearch.
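That test would look something like this, using the value from the forwarder's environment dump above:

# Force ausearch to resolve shared libraries the way it would under the UF
LD_LIBRARY_PATH=/opt/splunkforwarder/lib /usr/sbin/ausearch --input-logs --checkpoint "$path_to_checkpoint"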

isoutamo
SplunkTrust

There are some important differences, e.g. USER is different: when running from the UF it's splunkfwd, while in the other cases it's root.  Have you tried running it from the command line as user splunkfwd, rather than as root or via sudo?  Also, at least the libraries are searched in a different order.

It's easier to spot differences when you sort the env variables before comparing.  It's also easy to cat the dumps through the same sort | uniq -c to check whether anything is missing or different between the separate runs.
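Something like this, assuming the two dumps were saved to files (the filenames are illustrative):

sort manual_env.txt uf_env.txt | uniq -u             # lines unique to one environment
sort manual_env.txt uf_env.txt | uniq -c | sort -n   # or with counts, as suggested above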

miketbrand0
Explorer

Oddly enough, that seems to raise as many questions as it answers.

When I run the script manually from CMD while sudo'ed as splunkfwd, I get an error message indicating that splunkfwd doesn't have permission to access /var/log/audit/audit.log nor /etc/audit/audit.conf.  It isn't clear to me why, when splunkfwd is being used by Splunk UF to execute scripted inputs, it seems to have the necessary permissions to run "ausearch" to completion without errors (at least in the first round, when no checkpoint exists yet), but when I try to execute the same script as the same user manually from CMD, suddenly I don't have the necessary permissions.

Here are the environment variables that were in place during the execution of the script:

_=/bin/printenv
HISTSIZE=1000
HOME=/opt/splunkforwarder
HOSTNAME=<deployment_client_name>
LANG=en_US.UTF-8
LOGNAME=splunkfwd
LS_COLORS=rs=0:di=38;5;33:ln=38;5;51: ... etc
MAIL=/var/spool/mail/miketbrand0
PATH=/sbin:/bin:/usr/sbin:/usr/bin
PWD=/home/miketbrand0
SHELL=/bin/bash
SHLVL=1
SUDO_COMMAND=/bin/bash /opt/splunkforwarder/etc/apps/<app_name>/bin/audit_log_retreiver.sh
SUDO_GID=0
SUDO_UID=0
SUDO_USER=root
TERM=xterm-256color
USER=splunkfwd

When I got the permission-denied error, ausearch exited with an exit code of 1, indicating that no matches were found in the search results (which is a bit disingenuous, because it never actually got to look for matches).  After I ran the script as root once and then re-owned the checkpoint file to splunkfwd, I tried running the script as splunkfwd again.  This time ausearch yielded an exit code of 10, which is consistent with what I have observed when Splunk UF executes the script.

I think that means that whatever problem is causing ausearch to interpret the checkpoint as corrupted lies with the splunkfwd user, and not with Splunk UF itself.


PickleRick
SplunkTrust

If you run a fairly modern UF by means of a systemd unit, it should get the CAP_DAC_READ_SEARCH capability, which allows it to read files it normally wouldn't have access to (without it you would need to do heavy file-permissions magic to ingest logs).

If you simply su/sudo to the splunkfwd user you don't have those capabilities.
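You can confirm both halves of that on the host; this assumes the systemd unit is named SplunkForwarder.service, which is the default when boot-start is enabled as systemd-managed:

systemctl show SplunkForwarder.service --property=AmbientCapabilities
sudo -u splunkfwd capsh --print | grep Current   # a plain sudo shell shows an empty set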

 
