SEDCMD not replacing comment lines during indexing

54638
Explorer

I'm monitoring hosts files on Windows machines, but I don't want the comment lines when I ingest the file. However, my SEDCMD never seems to prevent the comment lines from being indexed.

My props.conf:

 [source::C:\\Windows\\System32\\drivers\\etc\\hosts]
 CHECK_METHOD = entire_md5
 SEDCMD-comments = s/\#.*\n//g

Here is a sample of the standard hosts file. In this example, the only line I want indexed is the last one, 255.255.255.255 wpad:

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#   127.0.0.1       localhost
#   ::1             localhost
255.255.255.255 wpad

Any advice on where my SEDCMD is wrong? The command seems to work fine in a search when I run | rex mode=sed "s/\#.*\n//g"

woodcock
Esteemed Legend

What you really should be doing is stripping all of the wasteful comments like this:

[YourSourcetypeHere]
SHOULD_LINEMERGE = false
LINE_BREAKER = ((?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*))

If you do that, then all you have left is the one line.

54638
Explorer

Thanks. I will try it, but what is the benefit of doing it this way as opposed to using SEDCMD? Doesn't SEDCMD strip the comments as well?

Also, if there are multiple lines at the end of the file, will each line show up as a different event this way?

woodcock
Esteemed Legend

With LINE_BREAKER, the comment text gets dropped as the very first step in processing (line breaking), instead of a bit later (SEDCMD) or at the very end (nullQueue). It is also simpler. Yes, it should work for multiple non-comment lines; each one should show up as its own event.
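
To illustrate how that LINE_BREAKER should behave, here is a rough Python sketch, not Splunk itself. It assumes the behavior described in the props.conf documentation: whatever the first capture group matches is discarded, and the event boundary falls there. The helper name break_events and the second non-comment line in the sample are hypothetical, and Python's re module only approximates the PCRE engine Splunk uses.

```python
import re

# Same pattern as the LINE_BREAKER setting above.
LINE_BREAKER = re.compile(r"((?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*))")

# Tail of the hosts file, plus one extra (hypothetical) non-comment line.
SAMPLE = (
    "# localhost name resolution is handled within DNS itself.\n"
    "#   127.0.0.1       localhost\n"
    "#   ::1             localhost\n"
    "\n"
    "255.255.255.255 wpad\n"
    "10.0.0.1 example\n"
)


def break_events(text):
    """Split text into events, discarding whatever capture group 1 matches."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(text):
        group_start, group_end = match.span(1)
        if group_start > start:
            # Text between two breaker matches is one event.
            events.append(text[start:group_start])
        start = group_end
    if start < len(text):
        # Anything left after the last breaker match is the final event.
        events.append(text[start:])
    return events


if __name__ == "__main__":
    for event in break_events(SAMPLE):
        print(repr(event))
```

In this simulation the comment lines and the blank line are swallowed by the breaker, and each remaining line comes out as a separate event.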

to4kawa
Ultra Champion

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
SEDCMD-comments = s/#.*//g

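The \n in your pattern may be the problem: if each line has already been broken into its own event, there may be no trailing newline left for \n to match, so drop it from the SEDCMD as above.

You could also try PREAMBLE_REGEX: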

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
PREAMBLE_REGEX = #
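
The guess about \n can be checked outside Splunk with a small Python sketch. It assumes each event holds exactly one line with the newline already removed by LINE_BREAKER = ([\r\n]+), and it uses re.sub as a stand-in for SEDCMD's s/.../.../g; apply_sed is a hypothetical helper, not a Splunk API.

```python
import re


def apply_sed(event, pattern):
    """Rough stand-in for SEDCMD s/<pattern>//g applied to one event."""
    return re.sub(pattern, "", event)


# One event per line, newlines already stripped (assumption, see above).
events = [
    "# localhost name resolution is handled within DNS itself.",
    "#   127.0.0.1       localhost",
    "255.255.255.255 wpad",
]

# Original pattern: needs a newline after the comment, so nothing matches.
with_newline = [apply_sed(e, r"\#.*\n") for e in events]

# Pattern without \n: the comment text is removed from each event.
without_newline = [apply_sed(e, r"#.*") for e in events]

if __name__ == "__main__":
    print(with_newline == events)
    print(without_newline)
```

Under that assumption the original pattern leaves every event untouched, while the version without \n empties the comment events and keeps the wpad line.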