Hi @fatsug,

Adding to what was previously discussed, you can break down the behavior, and where it occurs, by merging $SPLUNK_HOME/etc/system/default/props.conf and $SPLUNK_HOME/etc/apps/Splunk_TA_nix/default/props.conf. @PickleRick's use of btool in the last question hints at how to do this:

$SPLUNK_HOME/bin/splunk btool --debug props list lastlog

The source type will also inherit [default] settings provided by any app, along with any additional [lastlog] settings you may have.
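With --debug, btool prefixes each line with the file that supplied the setting, so you can see exactly which configuration file wins for each value. As a rough illustration only (paths abbreviated, and the full output also lists the inherited defaults from system/default), it looks something like:

/opt/splunk/etc/apps/Splunk_TA_nix/default/props.conf  [lastlog]
/opt/splunk/etc/apps/Splunk_TA_nix/default/props.conf  KV_MODE = multi
/opt/splunk/etc/apps/Splunk_TA_nix/default/props.conf  LINE_BREAKER = ^((?!))$
/opt/splunk/etc/apps/Splunk_TA_nix/default/props.conf  SHOULD_LINEMERGE = false
/opt/splunk/etc/apps/Splunk_TA_nix/default/props.conf  TRUNCATE = 1000000
... plus the inherited settings from $SPLUNK_HOME/etc/system/default/props.conf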
Looking at the settings relevant to the question:

[lastlog]
KV_MODE = multi
LINE_BREAKER = ^((?!))$
SHOULD_LINEMERGE = false
TRUNCATE = 1000000

The LINE_BREAKER, SHOULD_LINEMERGE, and TRUNCATE settings tell Splunk where the boundaries of an event are. These settings are used by the linebreaker and aggregator on a heavy forwarder or indexer; they are not, except under specific conditions, used by a universal forwarder. Setting SHOULD_LINEMERGE = false disables the reassembly of multiple lines (delimited by LINE_BREAKER) into a single event; in this case, the LINE_BREAKER setting does that work for us more efficiently. As @PickleRick noted, the regular expression ^((?!))$ matches nothing, so Splunk never finds an event boundary. When lastlog.sh is executed, its entire output, up to TRUNCATE = 1000000 bytes (~1 MB), is indexed as one event:

USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57

If Splunk_TA_nix is not installed on the search head, then the sample above is what you would see as a single event in your search results.
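If you want to verify this on your own data, the default linecount field (which Splunk records at index time) shows how many lines each event contains. A minimal check, using the same index=main sourcetype=lastlog as the searches later in this post:

index=main sourcetype=lastlog
| table _time linecount

If line breaking behaves as described above, you should see one event per script run whose linecount matches the number of lines lastlog.sh printed, rather than one event per line.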
What happens if the lastlog.sh output is longer than 1,000,000 bytes? All data after the 1,000,000th byte is simply truncated and lost. For example:

USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57
... ~1,000,000 bytes of data
jsmit

If the "t" in "jsmit" is the 1,000,000th byte, then that's the last byte indexed, and everything after "jsmit" is truncated.
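If you ever do need more than ~1 MB per event, one option (a sketch of a local override, not something the TA ships) is to raise TRUNCATE for the source type in a local props.conf on the parsing tier, for example $SPLUNK_HOME/etc/apps/Splunk_TA_nix/local/props.conf on your heavy forwarders or indexers:

[lastlog]
# Assumption: 5 MB is enough headroom for your hosts; size this to your actual lastlog.sh output.
TRUNCATE = 5000000

Settings in local/ take precedence over the app's default/ file, so the override survives TA upgrades.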
If Splunk_TA_nix is installed on the search head, then the KV_MODE = multi setting tells Splunk to pipe the events through the multikv command before returning the results. On disk there is still only one event; at search time it is split into six separate events, using the first line for field names. If Splunk_TA_nix is not installed on the search head, you can include the multikv command in your search to produce the same results:

index=main sourcetype=lastlog
| multikv

You can also test multikv directly using sample data:

| makeresults
| eval _raw="USERNAME FROM LATEST
user2 10.0.0.1 Wed Oct 30 11:20
another_user 10.0.0.1 Wed Oct 30 11:21
discovery 10.0.0.2 Tue Oct 29 22:19
scanner 10.0.0.3 Mon Oct 28 21:39
admin_user 10.0.0.4 Mon Oct 21 11:19
root 10.0.0.1 Tue Oct 1 08:57"
| multikv

If you have Splunk_TA_nix installed on your forwarders, your heavy forwarders, your indexers, and your search heads, then everything is "OK." Technically, your heavy forwarders are doing the parsing work instead of your indexers; however, if your indexers are Linux, you probably want to run lastlog.sh on them, too.

Eventually, though, you'll realize that auditd, /var/log/auth.log, /var/log/secure, etc. are better sources of login data, although /var/log/wtmp and lastlog.sh are useful if you want regular snapshots of login times relative to wtmp retention on your hosts. You may find that log rotation for wtmp is either misconfigured or not configured at all on older Linux hosts.
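As a rough starting point for those richer sources, and assuming you monitor /var/log/secure with Splunk_TA_nix's linux_secure source type and index it in index=main (adjust both to your environment; the rex also assumes the standard sshd "Accepted ... for ... from ..." message format), something like this gives you per-login events rather than periodic snapshots:

index=main sourcetype=linux_secure "Accepted"
| rex "Accepted \w+ for (?<user>\S+) from (?<src_ip>\S+)"
| stats latest(_time) AS latest_login BY user src_ip
| convert ctime(latest_login)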