
Help with extracting and filtering header preambles



We have an ugly custom log file, and we'd like to filter out the beginning of it: everything from the first line down to the first line with a valid timestamp. Is that possible?

Here's a sample:

Lots of lines like this:

Application Name:               Email_MMK_Node3_PR
Application Type:               EMAIL_SERVER
Application stuff....
Application Options: {
  { pop-client ['password' [output suppressed], 'move-failed-ews-item' [str] = "true", 'protocol-timeout' [str] = "00:05:00", 'maximum-msg-size' [str] = "5", 'type' [str] = "IMAP", 'address' [str] = "#", 'exchange-version' [str] = "Exchange2010_SP2", 'endpoint' [str] = "default", 'folder-path' [str] = "INBOX", 'leave-msg-on-server' [str] = "false", 'folder-separator' [str] = "/", 'port' [str] = "995", 'delete-bad-formatted-msg' [str] = "false", 'failed-items-folder-name' [str] = "", 'pop-connection-security' [str] = "none", 'enable-debug' [str] = "false", 'connect-timeout' [str] = "00:00:30", 'cycle-time' [str] = "00:00:30", 'mailbox' [str] = "#", 'enable-big-msg-stripping' [str] = "false", 'server' [str] = "", 'delete-big-msg' [str] = "false", 'enable-client' [str] = "false", 'allow-bad-msg-size' [str] = "false", 'maximum-msg-number' [str] = "500", ]}
  { pop-client-aaargprations ['password' [output suppressed], 'move-failed-ews-item' [str] = "true", 'protocol-timeout' [str] = "00:05:00", 'maximum-msg-size' [str] = "5", 'type' [str] = "IMAP", 'address' [str] = "", 'exchange-version' [str] = "Exchange2010_SP2", 'endpoint' [str] = "blahblahIn_Endpoint", 'folder-path' [str] = "INBOX", 'leave-msg-on-server' [str] = "true", 'folder-separator' [str] = "/", 'port' [str] = "993", 'delete-bad-formatted-msg' [str] = "false", 'failed-items-folder-name' [str] = "failedItems", 'pop-connection-security' [str] = "ssl-tls", 'connect-timeout' [str] = "00:00:30", 'enable-debug' [str] = "false", 'cycle-time' [str] = "00:00:30", 'mailbox' [str] = "1234rkrgprations", 'enable-big-msg-stripping' [str] = "false", 'server' [str] = "", 'delete-big-msg' [str] = "false", 'enable-client' [str] = "false", 'allow-bad-msg-size' [str] = "false", 'maximum-msg-number' [str] = "500", ]}

23:11:14.972 Dbg 29999 [EmailServer] Configuring 'MESSAGE_SERVER' connection

Any suggestions?


Revered Legend

Try something like this

props.conf on Indexer/Heavy forwarder (the stanza name below is a placeholder; use your own sourcetype)

[your_sourcetype]
LINE_BREAKER = ([\r\n]+)\d+\:\d+\:\d+
...other time format settings---
TRANSFORMS-removeheaderevent = setnull

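transforms.conf on Indexer/Heavy forwarder

[setnull]
# \w also matches digits, so ^\w+ would match the timestamped events too.
# Match a leading letter so only the header event is dropped.
REGEX = ^[A-Za-z]
DEST_KEY = queue
FORMAT = nullQueue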


Can you elaborate? What is the regex doing?


Revered Legend

The props.conf splits your log into events, each starting with a timestamp (23:11:14.972 in the example above). This leaves one extra-large event containing all the header preamble text. The TRANSFORMS setting then finds that huge header event, which I assume starts with a word rather than a timestamp, and drops it by routing it to the nullQueue. (See the transforms.conf documentation for usage.)
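To make the two steps concrete, here is a rough Python sketch of the same logic: break the text wherever a run of newlines is followed by a timestamp, then discard any event that starts with a letter. It is only an approximation for experimenting with the regexes, not how Splunk implements its pipeline. The function names and the shortened sample are made up for illustration. Note that `\w` also matches digits, so a drop pattern like `^\w+` would match the timestamped events as well; the sketch anchors on a letter instead.

```python
import re

# Approximation of LINE_BREAKER: the newline run is discarded and the next
# event starts at the timestamp that follows it (lookahead keeps the timestamp).
LINE_BREAKER = re.compile(r"([\r\n]+)(?=\d+:\d+:\d+)")

# Approximation of the transforms.conf REGEX. \w would also match the digits
# of a timestamp, so this sketch matches a leading letter instead.
DROP_EVENT = re.compile(r"^[A-Za-z]")


def split_events(raw):
    """Split raw text into events wherever a newline run precedes a timestamp."""
    events = []
    start = 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start()])
        start = m.end()
    events.append(raw[start:])
    # Trim trailing newlines and skip empty chunks.
    return [e.rstrip("\r\n") for e in events if e.strip()]


def keep_events(raw):
    """Drop events matching DROP_EVENT, mimicking routing to the nullQueue."""
    return [e for e in split_events(raw) if not DROP_EVENT.search(e)]


# Shortened, hypothetical version of the sample from the question.
sample = (
    "Application Name:               Email_MMK_Node3_PR\n"
    "Application Type:               EMAIL_SERVER\n"
    "\n"
    "23:11:14.972 Dbg 29999 [EmailServer] Configuring 'MESSAGE_SERVER' connection\n"
)

for event in keep_events(sample):
    print(event)
```

The whole preamble ends up as a single event because none of its line breaks are followed by a timestamp, which is why a single drop rule is enough to remove it.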



Thanks. Interesting approach. Never considered it.


Ultra Champion

For anyone who stumbles on this in the future with similar questions, remember that you can experiment with the sourcetype definition in the Add Data wizard. The Advanced panel of the Set Source Type menu is where you can adjust settings and see how Splunk would interpret the data.
