SEDCMD not replacing comment lines during indexing

54638
Explorer

I'm monitoring hosts files on Windows machines, but I don't want the comment lines when I ingest the file. However, my SEDCMD never seems to prevent the comment lines from being indexed.

My props.conf:

 [source::C:\\Windows\\System32\\drivers\\etc\\hosts]
 CHECK_METHOD = entire_md5
 SEDCMD-comments = s/\#.*\n//g

Here is a sample of the standard hosts file. In this example, the only line I want in my event is the last one, 255.255.255.255 wpad:

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#   127.0.0.1       localhost
#   ::1             localhost
255.255.255.255 wpad

Any advice on where my SEDCMD is wrong? The command seems to work fine in a search when I run | rex mode=sed "s/\#.*\n//g"

woodcock
Esteemed Legend

What you really should be doing is stripping all of the wasteful comments like this:

[YourSourcetypeHere]
SHOULD_LINEMERGE = false
LINE_BREAKER = ((?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*))

If you do that, then all you have left is the one line.
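
A rough way to see what this LINE_BREAKER does, sketched in Python rather than Splunk itself: the text matched by the first capture group marks an event boundary and is discarded, so the comment lines vanish together with the newlines. The helper name break_events and the shortened sample are made up for illustration, and Python's re module is only a stand-in for Splunk's regex engine, so treat this as an approximation.

```python
import re

# Same pattern as the LINE_BREAKER above; group 1 is the text that gets discarded.
LINE_BREAKER = re.compile(r"((?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*))")


def break_events(text):
    """Hypothetical helper: split text into events the way a line breaker
    with a discarded first capture group would, dropping empty chunks."""
    events = []
    pos = 0
    for match in LINE_BREAKER.finditer(text):
        chunk = text[pos:match.start(1)]  # text before the boundary is an event
        if chunk:
            events.append(chunk)
        pos = match.end(1)  # the next event starts after the discarded text
    tail = text[pos:]
    if tail:
        events.append(tail)
    return events


# Shortened version of the hosts file from the question.
sample = (
    "# Copyright (c) 1993-2009 Microsoft Corp.\n"
    "#\n"
    "#      102.54.94.97     rhino.acme.com          # source server\n"
    "\n"
    "# localhost name resolution is handled within DNS itself.\n"
    "#   ::1             localhost\n"
    "255.255.255.255 wpad\n"
)

# Hypothetical variant with a second host entry appended.
two_hosts = sample + "10.0.0.5 intranet\n"

if __name__ == "__main__":
    print(break_events(sample))
    print(break_events(two_hosts))
```

In this sketch the first sample leaves only the wpad line, and the variant with two host entries yields two separate events. A trailing comment on a host line (for example "10.0.0.5 intranet # note") would stay in the event, because the pattern only consumes comments that start a line.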

54638
Explorer

Thanks. I will try it, but what is the benefit of doing it this way as opposed to using SEDCMD? Doesn't SEDCMD strip the comments as well?

Also, if there are multiple lines at the end of the file, will each line show up as a different event this way?

woodcock
Esteemed Legend

With LINE_BREAKER, the comment text gets dropped as the very first step in processing, instead of a bit later (SEDCMD) or at the very end (nullQueue). It is also simpler. Yes, it should work for multiple non-comment lines: each one should show up as its own event.

to4kawa
Ultra Champion
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
SEDCMD-comments = s/#.*//g

Maybe the \n is missing. Also try PREAMBLE_REGEX:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
PREAMBLE_REGEX = #
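
One possible reading of the "\n is missing" remark, sketched in Python: if the file is broken into one event per line, no single event contains a newline, so a pattern that requires \n has nothing to match, while the same pattern run over one multi-line block (as in the rex test from the question) does match. This assumes the sed expression is applied to each event separately; re.sub is only a stand-in for SEDCMD, and the shortened sample and the helper name sed_delete are hypothetical.

```python
import re

# Shortened version of the hosts file from the question.
hosts_text = (
    "# Copyright (c) 1993-2009 Microsoft Corp.\n"
    "#\n"
    "#   127.0.0.1       localhost\n"
    "255.255.255.255 wpad\n"
)

# Stand-in for LINE_BREAKER = ([\r\n]+): newline runs are discarded,
# leaving one event per line.
events = [e for e in re.split(r"[\r\n]+", hosts_text) if e]


def sed_delete(pattern, text):
    """Hypothetical helper mimicking s/<pattern>//g with Python's re.sub."""
    return re.sub(pattern, "", text)


# Original pattern, applied per event: nothing changes, no event has a newline.
with_newline = [sed_delete(r"\#.*\n", e) for e in events]

# Pattern without \n, applied per event: comment events become empty strings.
without_newline = [sed_delete(r"#.*", e) for e in events]

# Original pattern, applied to the whole multi-line text at once.
whole_text = sed_delete(r"\#.*\n", hosts_text)

if __name__ == "__main__":
    print(with_newline)
    print(without_newline)
    print(repr(whole_text))
```

Note that the second pattern empties the comment events rather than removing them; what Splunk then does with an empty event is not something this sketch tries to model.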