Getting Data In

bash_history file monitoring

minjg
Loves-to-Learn

Hi.
I'm using Splunk Enterprise 7.3.2 and installed universal forwarder 8.2.6 on Linux.
I was asked to monitor the .bash_history file, so I installed the universal forwarder and checked that data is coming into Splunk.
However, in a real-time search, most of each file's existing content is imported along with the newly added data. This makes monitoring difficult because previously imported events are mixed with real-time events.
When I run the real-time search again, the _time field of the previously imported events and the newly added events is the same. Is the problem related to this?
Does anyone know how to solve this problem?

+ inputs.conf settings
[monitor:///home/*/.bash_history]
index=test
sourcetype=test_add
disabled=false
crcSalt = <SOURCE>

[monitor:///root/.bash_history]
index=test
sourcetype=test_add
disabled=false
crcSalt = <SOURCE>


PickleRick
SplunkTrust

OK. Let's start from the beginning.

1. Monitoring files this way requires your forwarder to run with root permissions so that it can read all those files. That may be a problem for your security team and is generally not the best idea (although sometimes it can't be avoided).

2. Monitoring the .bash_history files is not a very good way to monitor user activity. Bash history is easy to manipulate: a user can turn it off completely or bypass it. There are other ways to monitor user activity (some of them are more convenient, some not, I admit). If you want to limit yourself to just bash and have a log of bash history entries, you can set the syslog_history option for bash and have it log to the local syslog daemon. It's still not a great, fail-safe solution, but it's way better than reading each user's separate file.

3. If you want to stick with reading the .bash_history files, you should make sure your events are timestamped. If the environment variable HISTTIMEFORMAT is set, bash writes a timestamp line (a # followed by the epoch time) before each entry in the history file; the variable's value only controls how the history builtin displays it. You should make this variable persistent across your whole environment (for example, set it in a script under /etc/profile.d/). Without it the behaviour will be as you're describing: the events are not timestamped, so Splunk has no way of telling when they happened.

4. I hope you don't have too many users on your box, because you might run out of file descriptors if you open too many files.

5. Oh, and BTW, 7.x has been obsolete for some years now, so it's time to consider an upgrade 😉

minjg
Loves-to-Learn

To test the HISTTIMEFORMAT setting, I added export HISTTIMEFORMAT='%F %T' to /root/.bashrc instead of /etc/profile.d.


1. However, in Splunk, the timestamps and the commands are recognized as separate events and show up separately in searches.

rm -rf local/ -> event 1
#1714721901 -> event 2
cd /opt/splunkforwarder/etc/apps/ -> event 3
#1714721771 -> event 4

2. To test the timestamps, I added the following settings to props.conf on another Splunk instance that works well.

[test_bash_history]

BREAK_ONLY_BEFORE = #(?=\d+)
MAX_TIMESTAMP_LOOKAHEAD = 11
TIME_PREFIX = #
TIME_FORMAT = #%s

Is this setting correct?


PickleRick
SplunkTrust

Try to post code snippets in either a preformatted paragraph or a code block - it helps readability.

But to the point - the BREAK_ONLY_BEFORE setting is only applied when SHOULD_LINEMERGE is set to true (which generally should be avoided whenever possible).

To split your input into events containing both the timestamp and the command, you'd need to adjust your LINE_BREAKER so that it doesn't treat every line as a separate event but breaks the input stream only at newlines followed immediately by a hash and a timestamp.

It would probably be something like

LINE_BREAKER=([\r\n]+)#\d+
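
If you want to check the idea before touching props.conf, here is a rough Python sketch that imitates the split on the sample lines from this thread. It is only a simulation, not Splunk's own code: it relies on the documented rule that the text matched by the first capturing group is discarded and whatever follows it (here the #<epoch> part) becomes the start of the next event. The names split_events and event_time are made up for illustration, and the sample string assumes the file order is timestamp line first, then the command.

```python
import re
from datetime import datetime, timezone

# Same pattern as the LINE_BREAKER suggested above.
LINE_BREAKER = re.compile(r"([\r\n]+)#\d+")


def split_events(raw):
    """Imitate the split: drop what group 1 matched, keep '#<epoch>' in the next event."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # previous event ends where group 1 starts
        start = match.end(1)                      # next event starts where group 1 ends
    tail = raw[start:].rstrip("\r\n")
    if tail:
        events.append(tail)
    return events


def event_time(event):
    """Read the epoch seconds right after the leading '#' (hypothetical helper)."""
    match = re.match(r"#(\d+)", event)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


# Sample content in file order, taken from the values posted earlier in the thread.
sample = (
    "#1714721771\n"
    "cd /opt/splunkforwarder/etc/apps/\n"
    "#1714721901\n"
    "rm -rf local/\n"
)

for event in split_events(sample):
    print(event_time(event), repr(event))
```

If Splunk splits the stream the same way, every event begins with the #<epoch> line. In that case the timestamp settings would probably need TIME_PREFIX to match the # and TIME_FORMAT to be just %s, because TIME_FORMAT is applied to the text after the prefix.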

 
