SEDCMD not replacing comment lines during indexing

54638
Explorer

I'm monitoring hosts files on Windows machines, but I don't want the comment lines when I ingest the file. However, my SEDCMD never seems to prevent the comment lines from being indexed.

My props.conf:

 [source::C:\\Windows\\System32\\drivers\\etc\\hosts]
 CHECK_METHOD = entire_md5
 SEDCMD-comments = s/\#.*\n//g

Here is a sample of the standard hosts file. In this example, I only want the last line, 255.255.255.255 wpad, in my event:

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#   127.0.0.1       localhost
#   ::1             localhost
255.255.255.255 wpad

Any advice on where my SEDCMD is wrong? The command seems to work fine in a search when I run | rex mode=sed "s/\#.*\n//g"

woodcock
Esteemed Legend

What you really should be doing is stripping all of the wasteful comments like this:

[YourSourcetypeHere]
SHOULD_LINEMERGE = false
LINE_BREAKER = ((?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*))

If you do that, then all you have left is the one line.
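To see why this leaves only the host entries, here is a rough Python simulation of the line-breaking rule. It is a sketch, not Splunk's actual implementation: it assumes only that the text matched by the first capture group of LINE_BREAKER is discarded and that events are whatever remains between matches. The function name `break_events` and the shortened sample text are made up for illustration, and Python's `re` stands in for the PCRE engine Splunk uses.

```python
import re

# The LINE_BREAKER pattern from the answer above, copied verbatim.
LINE_BREAKER = r"((?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*))"


def break_events(raw, line_breaker):
    """Split raw text into events, discarding whatever capture group 1 matches.

    Hypothetical helper: it mimics the documented LINE_BREAKER behavior
    (group 1 is thrown away) and skips empty chunks for readability.
    """
    events = []
    start = 0
    for match in re.finditer(line_breaker, raw):
        chunk = raw[start:match.start(1)]
        if chunk:
            events.append(chunk)
        start = match.end(1)
    tail = raw[start:]
    if tail:
        events.append(tail)
    return events


# Shortened stand-in for the hosts file in the question.
hosts = "\n".join([
    "# Copyright (c) 1993-2009 Microsoft Corp.",
    "#",
    "# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.",
    "",
    "# localhost name resolution is handled within DNS itself.",
    "#   127.0.0.1       localhost",
    "255.255.255.255 wpad",
]) + "\n"

events = break_events(hosts, LINE_BREAKER)
print(events)  # ['255.255.255.255 wpad']

# Several non-comment lines each become their own event.
multi = break_events("# comment\n10.0.0.1 alpha\n10.0.0.2 beta\n", LINE_BREAKER)
print(multi)  # ['10.0.0.1 alpha', '10.0.0.2 beta']
```

Each comment line is swallowed by the capture group together with the newline before it, so nothing from it survives into an event.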

54638
Explorer

Thanks. I will try it, but what is the benefit of doing it this way as opposed to using SEDCMD? Doesn't SEDCMD strip the comments as well?

Also, if there are multiple lines at the end of the file, will each line show up as a different event this way?

woodcock
Esteemed Legend

With LINE_BREAKER, the comments get dropped at the very first step in processing, instead of a bit later (SEDCMD) or at the very end (nullQueue). It is also simpler. Yes, it should work for multiple non-comment lines.

to4kawa
SplunkTrust
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
SEDCMD-comments = s/#.*//g

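Maybe the \n is the problem: once LINE_BREAKER splits on ([\r\n]+), each line is its own event with no trailing newline left for the pattern to match, so leave \n out of the SEDCMD as above.
Also try PREAMBLE_REGEX: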

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
PREAMBLE_REGEX = #
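
To illustrate the point about \n: the sketch below applies both sed patterns to events that have already been split one per line, which is what LINE_BREAKER = ([\r\n]+) with SHOULD_LINEMERGE = false produces. It is only an approximation, with Python's `re.sub` standing in for Splunk's sed-style replacement; the helper name `sed_delete` and the three sample events are assumptions for illustration.

```python
import re

# One event per line, with no trailing newline (assumed result of
# LINE_BREAKER = ([\r\n]+) and SHOULD_LINEMERGE = false).
per_line_events = [
    "# Copyright (c) 1993-2009 Microsoft Corp.",
    "#",
    "255.255.255.255 wpad",
]


def sed_delete(event, pattern):
    """Rough equivalent of s/<pattern>//g applied to a single event."""
    return re.sub(pattern, "", event)


# Pattern from the question: needs a newline after the comment text.
with_newline = [sed_delete(e, r"\#.*\n") for e in per_line_events]

# Pattern from this answer: no newline required.
without_newline = [sed_delete(e, r"#.*") for e in per_line_events]

print(with_newline == per_line_events)  # True: nothing matched, nothing removed
print(without_newline)                  # ['', '', '255.255.255.255 wpad']
```

The second pattern empties the comment events rather than removing them, so how those now-empty events are handled afterwards is a separate question that this sketch does not answer.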