Hi all
After installing Splunk_TA_nix with no local/inputs on the heavy forwarders, the error I was seeing in this post went away, so that one was actually solved.
However, the issue with missing linebreaks in the output mentioned by @PickleRick remains:
"1) Breaks the whole lastlog output into separate events on the default LINE_BREAKER (which means every line is treated as separate event)"
So I thought I'd see if I could get that one confirmed and/or fixed as well.
When searching for "source=lastlog" right now, I get a list of events from each host like so:
> user2 10.0.0.1 Wed Oct 30 11:20
> another_user 10.0.0.1 Wed Oct 30 11:21
> discovery 10.0.0.2 Tue Oct 29 22:19
> scanner 10.0.0.3 Mon Oct 28 21:39
> admin_user 10.0.0.4 Mon Oct 21 11:19
> root 10.0.0.1 Tue Oct 1 08:57
Before placing the TA on the HFs, I would see output containing only the header
> USERNAME FROM LATEST
Which is completely useless 😄
After adding the TA to the HFs, this "header" line is no longer present at all in any events from any server, while field names are correct and fully searchable, with IP addresses, usernames, etc.
My question at this point is probably best formulated as "am I alright now"? 😁
Based on the feedback in the previous post, I was sort of assuming that the expected output/events should match the screen output when running the script locally, i.e. one event containing the entire output, like so:
USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57
While I can see this being easier on the eyes and easier to interpret, it could make processing individual field:value pairs in searches more problematic.
So what I am wondering is: is everything "OK" now, or am I still getting events with incorrect linebreaks? I don't know what the expected/correct output should be.
Best regards
Hi @fatsug,
Adding to what was previously discussed, you can break down the behavior and where it occurs by merging $SPLUNK_HOME/etc/system/default/props.conf and $SPLUNK_HOME/etc/apps/Splunk_TA_nix/default/props.conf. @PickleRick's use of btool in the last question hints at how to do this:
$SPLUNK_HOME/bin/splunk btool --debug props list lastlog
The source type will also inherit [default] settings provided by any app and any additional [lastlog] settings you may have.
Looking at the settings relevant to the question:
[lastlog]
KV_MODE = multi
LINE_BREAKER = ^((?!))$
SHOULD_LINEMERGE = false
TRUNCATE = 1000000
The LINE_BREAKER, SHOULD_LINEMERGE, and TRUNCATE settings tell Splunk the boundaries of the event. These settings are used by the linebreaker and aggregator on a heavy forwarder or indexer; they are not, except under specific conditions, used by a universal forwarder. The SHOULD_LINEMERGE setting disables reassembly of multiple lines (delineated by the LINE_BREAKER setting) into a single event; in this case, the LINE_BREAKER setting does that work for us more efficiently.
As @PickleRick noted, the regular expression ^((?!))$ matches nothing. When lastlog.sh is executed, its entire output up to TRUNCATE = 1000000 bytes (~1 MB) is indexed as one event:
USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57
If Splunk_TA_nix is not installed on the search head, then the sample above is what you would see as a single event in your search results.
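As a quick sanity check outside Splunk, you can confirm that `^((?!))$` can never match anything (shown here with Python's `re` module, whose lookahead behavior matches PCRE's for this construct): `(?!)` is a negative lookahead on an empty pattern, and it always fails.

```python
import re

# The LINE_BREAKER regex from the [lastlog] stanza.
# (?!) is a negative lookahead that always fails, so the pattern
# as a whole can never match -- Splunk finds no break points.
pattern = re.compile(r"^((?!))$", re.MULTILINE)

sample = "USERNAME FROM LATEST\nroot 10.0.0.1 Tue Oct 1 08:57\n"
print(pattern.search(sample))  # -> None: no line break would ever occur
```

Because the regex never matches, the linebreaker falls through to the TRUNCATE limit, which is exactly the single-event behavior described above.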
What happens if the lastlog.sh output is longer than 1,000,000 bytes? All data after the 1,000,000th byte is simply truncated and lost. For example:
USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57
... ~1,000,000 bytes of data
jsmit
If the "t" in "jsmit" is the 1,000,000th byte, then that's the last byte indexed, and everything after "jsmit" is truncated.
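If you ever do hit that limit, you can raise or disable it with a local override on the parsing tier. A hypothetical example (file path and value are illustrative; `TRUNCATE = 0` disables truncation entirely, which carries its own memory risk for pathological inputs) in `$SPLUNK_HOME/etc/apps/Splunk_TA_nix/local/props.conf`:

```ini
# Example local override -- adjust to your environment
[lastlog]
# 0 = never truncate; alternatively set a larger explicit cap
TRUNCATE = 0
```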
If Splunk_TA_nix is installed on the search head, then the KV_MODE = multi setting tells Splunk to pipe the events through the multikv command before returning the results. On disk there is still only one event; at search time it is split into six separate results, using the first line for field names.
If Splunk_TA_nix is not installed on the search head, you can include the multikv command in your search to produce the same results:
index=main sourcetype=lastlog
| multikv
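Once multikv has split the event, the extracted fields behave like any others in SPL. For example (index and field names are assumptions based on the sample output above), to see each user's most recent recorded login per host:

index=main sourcetype=lastlog
| multikv
| stats latest(LATEST) AS last_login BY host, USERNAME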
You can also test multikv directly using sample data:
| makeresults
| eval _raw="USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57"
| multikv
If you have Splunk_TA_nix installed on your forwarders, your heavy forwarders, your indexers, and your search heads, then everything is "OK." Technically, your heavy forwarders are doing the parsing work instead of your indexers; however, if your indexers are Linux, you probably want to run lastlog.sh on them, too.
Eventually, though, you'll realize that auditd, /var/log/auth.log, /var/log/secure, etc. are better sources of login data, although /var/log/wtmp and lastlog.sh are useful if you want regular snapshots of login times relative to wtmp retention on your hosts. You may find that log rotation for wtmp is either misconfigured or not configured at all on older Linux hosts.