Best practice for indexing files with headers (preamble_regex, field_header_regex, header_field_line_number)

threatanalyst · 05-31-2018

I have been trying to understand when it is best practice to use PREAMBLE_REGEX, FIELD_HEADER_REGEX, and/or HEADER_FIELD_LINE_NUMBER when indexing files with headers. I couldn't find answers to the following questions in the documentation:

Will one attempted behavior ever "override" another?
If I use them all, in what order do they take priority (listed order, some other order)?
Is it best to only use the minimum number of settings required, or should I always try to set all of them?
If a file without actual events still contains the header, how do I avoid Splunk registering the header as a separate event?

For example, I'm trying to parse the following sample output from TZWorks:

usp - full ver: 0.52; Copyright (c) TZWorks LLC
License #-------------- is authenticated for business use and registered to --------------
run time: -------------- [UTC]; Host: -------------
"cmdline: C:\--------------\usp64.exe -csvl2t -partition C:"
note: When comparing timestamps from manual analysis use option [-show_other_times] to see full range of timestamps recovered

date,time,timezone,MACB,source,sourcetype,type,user,host,short,desc,version,filename,inode,notes,format,extra
$sampledata...

I set up the following lines in props.conf (among other settings):

[usp]
PREAMBLE_REGEX = ^(usp|License|run|\"cmdline|\s*$)
FIELD_HEADER_REGEX = ^date
HEADER_FIELD_LINE_NUMBER = 7

These settings seem to work as long as the event files are consistent with the sample above. However, when no events are found, neither the header line ("date,time,timezone,..." etc.) nor the $sampledata is present, and Splunk indexes the first 5 lines as an actual event. Is there a better way to approach this in general that might also solve my issue when the file does not contain events?
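One way to sanity-check the PREAMBLE_REGEX above outside of Splunk is to run it against the sample lines. The sketch below uses Python's re module as a stand-in, so it only approximates how Splunk evaluates the pattern; the names SAMPLE_LINES and classify are made up for illustration.

```python
import re

# Regex copied from the [usp] stanza in the post. Python's re module is only
# a stand-in here; Splunk's regex engine may differ on edge cases.
PREAMBLE_REGEX = re.compile(r'^(usp|License|run|\"cmdline|\s*$)')

# Sample lines from the post, with the redactions kept as-is.
SAMPLE_LINES = [
    'usp - full ver: 0.52; Copyright (c) TZWorks LLC',
    'License #-------------- is authenticated for business use and registered to --------------',
    'run time: -------------- [UTC]; Host: -------------',
    '"cmdline: C:\\--------------\\usp64.exe -csvl2t -partition C:"',
    'note: When comparing timestamps from manual analysis use option [-show_other_times] to see full range of timestamps recovered',
    '',
    'date,time,timezone,MACB,source,sourcetype,type,user,host,short,desc,version,filename,inode,notes,format,extra',
]


def classify(lines):
    """Return (line_number, matches_preamble, text) for each line, 1-indexed."""
    return [
        (number, bool(PREAMBLE_REGEX.search(text)), text)
        for number, text in enumerate(lines, start=1)
    ]


if __name__ == "__main__":
    for number, is_preamble, text in classify(SAMPLE_LINES):
        label = "preamble" if is_preamble else "NOT matched"
        print(f"{number}: {label:11} {text[:40]}")
```

Under this approximation the pattern covers lines 1 to 4 and the blank line 6, but not the "note:" line (line 5), and the header sits on line 7, which agrees with HEADER_FIELD_LINE_NUMBER = 7. If the "note:" line is also meant to be treated as preamble, the alternation would presumably need a branch for it.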

richgalloway · 06-01-2018

The docs say the text matched by FIELD_HEADER_REGEX is not included in the header fields, so your current setting shouldn't work. That it does work tells me that setting is being trumped by one of the other two.
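To make that concrete, here is a rough Python simulation of the documented wording (the header starts after the matched pattern). It is not Splunk's implementation, only an illustration of what a literal reading of the docs would do to this header line; header_after_prefix is a hypothetical helper.

```python
import re

# Pattern from the [usp] stanza in the question.
FIELD_HEADER_REGEX = re.compile(r'^date')

# Header line from the sample file in the question.
HEADER_LINE = (
    'date,time,timezone,MACB,source,sourcetype,type,user,host,'
    'short,desc,version,filename,inode,notes,format,extra'
)


def header_after_prefix(line, pattern):
    """Drop the matched prefix and split the rest into field names.

    Assumption: this mirrors a literal reading of the docs, where the
    matched text itself is excluded from the header. Returns None when
    the pattern does not match.
    """
    match = pattern.search(line)
    if match is None:
        return None
    return line[match.end():].split(',')


if __name__ == "__main__":
    print(header_after_prefix(HEADER_LINE, FIELD_HEADER_REGEX))
```

Read that way, "date" would be consumed by the match and the first field name would come out empty. Since the date field apparently is extracted correctly in practice, that is consistent with one of the other two settings taking precedence.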

---
If this reply helps you, Karma would be appreciated.
