Hi all
After installing Splunk_TA_nix with no local/inputs on the heavy forwarders, the error I was seeing in this post went away, so that one was actually solved.
However, the issue with missing linebreaks in the output mentioned by @PickleRick remains:
"1) Breaks the whole lastlog output into separate events on the default LINE_BREAKER (which means every line is treated as separate event)"
So I thought I'd see if I could get that one confirmed and/or fixed as well.
When searching for "source=lastlog" right now, I get a list of events from each host like so:
> user2 10.0.0.1 Wed Oct 30 11:20
> another_user 10.0.0.1 Wed Oct 30 11:21
> discovery 10.0.0.2 Tue Oct 29 22:19
> scanner 10.0.0.3 Mon Oct 28 21:39
> admin_user 10.0.0.4 Mon Oct 21 11:19
> root 10.0.0.1 Tue Oct 1 08:57
Before placing the TA on the HFs, I would see output containing only the header
> USERNAME FROM LATEST
Which is completely useless 😄
After adding the TA to the HFs, this "header" line is no longer present at all in any events from any server, while field names are correct and fully searchable, with IP addresses, usernames, etc.
My question at this point is probably best formulated as "am I alright now"? 😁
Based on the feedback in the previous post, I was sort of assuming that the expected output/events should match the screen output when running the script locally, i.e. one event containing the entire output, like so:
USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57
While I can see this being easier on the eyes and easier to interpret, it could make processing individual field:value pairs in searches more problematic.
So what I am wondering is: is everything "OK" now, or am I still getting events with incorrect linebreaks? I don't know what the expected/correct output should be.
Best regards
Hi @fatsug,
Adding to what was previously discussed, you can break down the behavior and where it occurs by merging $SPLUNK_HOME/etc/system/default/props.conf and $SPLUNK_HOME/etc/apps/Splunk_TA_nix/default/props.conf. @PickleRick's use of btool in the last question hints at how to do this:
$SPLUNK_HOME/bin/splunk btool --debug props list lastlog
The source type will also inherit [default] settings provided by any app and any additional [lastlog] settings you may have.
Looking at the settings relevant to the question:
[lastlog]
KV_MODE = multi
LINE_BREAKER = ^((?!))$
SHOULD_LINEMERGE = false
TRUNCATE = 1000000
The LINE_BREAKER, SHOULD_LINEMERGE, and TRUNCATE settings tell Splunk the boundaries of the event. These settings are used by the linebreaker and aggregator on a heavy forwarder or indexer; they are not, except under specific conditions, used by a universal forwarder. The SHOULD_LINEMERGE setting disables reassembly of multiple lines (delineated by the LINE_BREAKER setting) into a single event; in this case, the LINE_BREAKER setting does that work for us more efficiently.
As @PickleRick noted, the regular expression ^((?!))$ matches nothing. When lastlog.sh is executed, its entire output up to TRUNCATE = 1000000 bytes (~1 MB) is indexed as one event:
USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57
If Splunk_TA_nix is not installed on the search head, then the sample above is what you would see as a single event in your search results.
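As a quick sanity check outside Splunk, you can confirm that `^((?!))$` can never match anything (shown here with Python's `re` module, whose lookahead behavior matches PCRE's for this construct): `(?!)` is a negative lookahead on an empty pattern, and it always fails.

```python
import re

# The LINE_BREAKER regex from the [lastlog] stanza.
# (?!) is a negative lookahead that always fails, so the pattern
# as a whole can never match -- Splunk finds no break points.
pattern = re.compile(r"^((?!))$", re.MULTILINE)

sample = "USERNAME FROM LATEST\nroot 10.0.0.1 Tue Oct 1 08:57\n"
print(pattern.search(sample))  # -> None: no line break would ever occur
```

Because the regex never matches, the linebreaker falls through to the TRUNCATE limit, which is exactly the single-event behavior described above.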
What happens if the lastlog.sh output is longer than 1,000,000 bytes? All data after the 1,000,000th byte is simply truncated and lost. For example:
USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57
... ~1,000,000 bytes of data
jsmit
If the "t" in "jsmit" is the 1,000,000th byte, then that's the last byte indexed, and everything after "jsmit" is truncated.
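If you ever do hit that limit, you can raise or disable it with a local override on the parsing tier. A hypothetical example (file path and value are illustrative; `TRUNCATE = 0` disables truncation entirely, which carries its own memory risk for pathological inputs) in `$SPLUNK_HOME/etc/apps/Splunk_TA_nix/local/props.conf`:

```ini
# Example local override -- adjust to your environment
[lastlog]
# 0 = never truncate; alternatively set a larger explicit cap
TRUNCATE = 0
```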
If Splunk_TA_nix is installed on the search head, then the KV_MODE = multi setting tells Splunk to pipe the events through the multikv command before returning the results. On disk there is still only one event; at search time it is split into six separate results, using the first line for field names.
If Splunk_TA_nix is not installed on the search head, you can include the multikv command in your search to produce the same results:
index=main sourcetype=lastlog
| multikv
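Once multikv has split the event, the extracted fields behave like any others in SPL. For example (index and field names are assumptions based on the sample output above), to see each user's most recent recorded login per host:

index=main sourcetype=lastlog
| multikv
| stats latest(LATEST) AS last_login BY host, USERNAME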
You can also test multikv directly using sample data:
| makeresults
| eval _raw="USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57"
| multikv
If you have Splunk_TA_nix installed on your forwarders, your heavy forwarders, your indexers, and your search heads, then everything is "OK." Technically, your heavy forwarders are doing the parsing work instead of your indexers; however, if your indexers are Linux, you probably want to run lastlog.sh on them, too.
Eventually, though, you'll realize that auditd, /var/log/auth.log, /var/log/secure, etc. are better sources of login data, although /var/log/wtmp and lastlog.sh are useful if you want regular snapshots of login times relative to wtmp retention on your hosts. You may find that log rotation for wtmp is either misconfigured or not configured at all on older Linux hosts.