How to extract time from bash_history # timestamp?

peterm30
Path Finder

I'm dealing with bash_history files in the following format. I would like to extract the timestamp and use that as the event timestamp, but I'm having some issues doing so.

#1579207583
whoami
#1579207584
cd /var/log
#1579207590
cat messages
#1579207595
id
#1579207598
exit

I'm using the following thread as reference: https://answers.splunk.com/answers/60015/splunking-bash-history.html

 [bash_history]
 BREAK_ONLY_BEFORE = #(?=\d+)
 MAX_TIMESTAMP_LOOKAHEAD = 11
 SHOULD_LINEMERGE = true
 TIME_FORMAT = %s
 TIME_PREFIX = #

We've changed a number of variables (set TIME_PREFIX = ^#, set MAX_TIMESTAMP_LOOKAHEAD to a higher value, etc.), but nothing seems to be working correctly.

The events do break in the correct place (#), and they do merge, so we get "groups" of events like:

#1579207583
 whoami

However, the timestamp for the event isn't set to that value. Instead, every event gets the date/time at which the history file was written, so all events from a given session share the same timestamp.

That props.conf configuration -appears- correct, and our sourcetype is named bash_history (we've also tried source::/root/.bash_history, without success). I'm not sure where we are going wrong, but any suggestions would be welcome.
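
For readers who want to see the intended result outside Splunk, here is a minimal Python sketch of what the stanza above is meant to achieve: start a new event at every `#<digits>` line and read those digits as epoch seconds. It is only an illustration, not how Splunk implements it. The name `parse_bash_history` is hypothetical, and timestamps are printed in UTC purely so the output is reproducible.

```python
import re
from datetime import datetime, timezone

# A line made of '#' followed only by digits starts a new event.
EPOCH_LINE = re.compile(r"^#(\d+)$")

def parse_bash_history(text):
    """Return a list of (epoch_seconds, command) pairs."""
    events = []
    current_ts = None
    current_cmd = []
    for line in text.splitlines():
        m = EPOCH_LINE.match(line)
        if m:
            # Close the previous event before starting a new one.
            if current_ts is not None:
                events.append((current_ts, "\n".join(current_cmd)))
            current_ts = int(m.group(1))
            current_cmd = []
        elif current_ts is not None:
            # Lines after a timestamp belong to that event
            # (lines before the first timestamp are ignored).
            current_cmd.append(line)
    if current_ts is not None:
        events.append((current_ts, "\n".join(current_cmd)))
    return events

sample = """#1579207583
whoami
#1579207584
cd /var/log
#1579207590
cat messages
"""

events = parse_bash_history(sample)
for ts, cmd in events:
    stamp = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    print(stamp, cmd)
```

Each pair corresponds to one "group" as described above, with the epoch value as the event time rather than the time the file was written.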

1 Solution

peterm30
Path Finder

I figured it out. The "default/props.conf" in Splunk_TA_nix contains several lines that affect the timestamp. I copied these to "local/props.conf" and unset them (didn't provide a value), and now it's working. Final props.conf looks like...

[bash_history]
BREAK_ONLY_BEFORE = #(?=\d+)
MAX_TIMESTAMP_LOOKAHEAD = 10
SHOULD_LINEMERGE = true
TIME_FORMAT = %s
TIME_PREFIX = ^#
EVENT_BREAKER_ENABLE =
DATETIME_CONFIG =

I also added a field extraction for the command itself:

^#\d+\s+(?P<command>.+) 

TL;DR - It was working from the beginning, but other values in default were affecting the final result.
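
As a quick sanity check of the extraction regex outside Splunk, the sketch below applies the same pattern to merged events using Python's `re` module. Splunk uses a PCRE-style engine rather than Python's, so treat this as an approximation; the variable names are illustrative only.

```python
import re

# Same pattern as the field extraction above (trailing space omitted).
COMMAND_RE = re.compile(r"^#\d+\s+(?P<command>.+)")

merged_events = [
    "#1579207583\nwhoami",
    "#1579207584\ncd /var/log",
]

commands = []
for event in merged_events:
    match = COMMAND_RE.search(event)
    # '.' does not match a newline here, so only the first command line is captured.
    commands.append(match.group("command") if match else None)

print(commands)
```

Because `.+` stops at the end of the line, a command that spans several lines would only have its first line captured by this pattern.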

woodcock
Esteemed Legend

Never use the break_* settings. Try this:

[bash_history]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+[\s#]*)
TIME_PREFIX = ^
TIME_FORMAT = %s
MAX_TIMESTAMP_LOOKAHEAD = 10

But that is probably not your problem. If you are sure that your settings are correct, it must be something else. If you are doing a sourcetype override/overwrite, your settings must use the ORIGINAL sourcetype value, NOT the new one. You must deploy your settings to the first full instance(s) of Splunk that handle the events (usually the HF tier if you use one, otherwise your Indexer tier), UNLESS you are using HEC's JSON endpoint (it gets pre-cooked) or INDEXED_EXTRACTIONS (configs go on the UF in that case); then restart all Splunk instances there. When (re)evaluating, you must send in new events (old events will stay broken), then test using _index_earliest=-5m to be absolutely certain that you are only examining the newly indexed events.

PavelP
Motivator

If it is still possible to change the host configuration, I'd suggest setting the variable HISTTIMEFORMAT to '%F %T ', which will not only make any time extraction work unnecessary, but also make the history human readable. For example, on CentOS you can add this to /etc/profile (or some other bash config file):
HISTTIMEFORMAT='%F %T '

The bash_history then looks like this:
999 2020-01-17 11:30:27 ping 192.168.1.2
1000 2020-01-17 11:30:30 history
1001 2020-01-17 11:30:40 set|grep FORMAT
1002 2020-01-17 11:30:44 man bash
1003 2020-01-17 11:31:12 export HISTTIMEFORMAT='%F %T '
1004 2020-01-17 11:31:13 history

Don't miss the space before the final quote!

peterm30
Path Finder

That's actually exactly what's in place. However, the internal log format is always timestamped with the #epoch timestamp. The behavior is described here: https://unix.stackexchange.com/questions/214322/write-bash-history-to-a-file-with-a-timestamp

In other words, if you cat the raw log, regardless of HISTTIMEFORMAT, you get the #epoch lines. Since Splunk is reading the raw log, that is what it gets.
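
To illustrate the point: the file stores only epoch seconds, and the HISTTIMEFORMAT value is applied when `history` prints an entry. The Python sketch below renders a stored `#<epoch>` line roughly the way a '%F %T ' format would. It assumes UTC so the result is reproducible (bash uses the local time zone), and `render_entry` is a hypothetical helper, not part of bash or Splunk.

```python
from datetime import datetime, timezone

def render_entry(epoch_line, command, fmt="%Y-%m-%d %H:%M:%S "):
    """Format a '#<epoch>' line plus its command, similar to HISTTIMEFORMAT='%F %T '.

    '%Y-%m-%d %H:%M:%S ' is used as the portable equivalent of '%F %T '.
    """
    epoch = int(epoch_line.lstrip("#"))
    stamp = datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(fmt)
    return stamp + command

rendered = render_entry("#1579207583", "whoami")
print(rendered)  # 2020-01-16 20:46:23 whoami
```

The raw file content stays `#1579207583` followed by `whoami` whatever format is chosen, which is why the props.conf has to parse the epoch value.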

to4kawa
Ultra Champion
| makeresults
| eval _raw="#1579207583
whoami
#1579207584
cd /var/log
#1579207590
cat messages
#1579207595
id
#1579207598
exit"
 `comment("this is sample you provide")`
| rex max_match=100 "(?:#)(?<time>\w+)"
| rex max_match=100 "(?m)^(?=[^#])(?<command>.+)$"
| eval tmp=mvzip(time,command)
| stats count by tmp
| eval _time=mvindex(split(tmp,","),0), command=mvindex(split(tmp,","),1)
| table _time command

If props.conf doesn't work, you can extract it with this query.

jarizeloyola
Path Finder

Try

 # props.conf
 [bash_history]
 # define event breaking behavior
 LINE_BREAKER = ([\r\n]+)\#\d+
 SHOULD_LINEMERGE = false

 # define time parsing behavior
 TIME_PREFIX = #
 TIME_FORMAT = %s
 MAX_TIMESTAMP_LOOKAHEAD = 12

peterm30
Path Finder

No luck. It appears to break lines at the correct place, as my original props.conf did. However, it's still not parsing the timestamp.

marycordova
SplunkTrust

I wonder whether replacing your entire props config as posted with just the stanza below would cover both the line breaking and the timestamping. Maybe test and let me know?

[bash_history]
LINE_BREAKER = (^\#)\d+

@marycordova

peterm30
Path Finder

No luck, it's breaking... weird. So one event comes in as

hi this is a text
#1579273320
exit

And the previous one as:

1579273315

(the timestamp minus the #). It appears to alternate like this. Neither event appears to actually use this value as its timestamp, though.

jarizeloyola
Path Finder

Where did you place your props.conf?

peterm30
Path Finder

It was deployed from the deployment server, within the Splunk_TA_nix app, to the UFs (so it lives in /opt/splunk/etc/deployment-apps/Splunk_TA_nix/local/).

badrinath_itrs
Communicator

Can you check the errors and warnings you are receiving for date/time parsing on the receiving Splunk instance?

peterm30
Path Finder

After looking in a few logs where I would expect an error to be (if there was one), I did a grep of -all- logs in /opt/splunk/var/log/splunk/ for "bash" and found nothing. Is there a specific log and/or keyword you know to check for?
