SEDCMD not replacing comment lines during indexing

54638
Explorer

I'm monitoring hosts files on Windows machines, but I don't want the comment lines when I ingest the file. However, my SEDCMD never seems to prevent the comment lines from being indexed.

My props.conf:

 [source::C:\\Windows\\System32\\drivers\\etc\\hosts]
 CHECK_METHOD = entire_md5
 SEDCMD-comments = s/\#.*\n//g

Here is a sample of the standard hosts file. In this example, I only want the last line, 255.255.255.255 wpad, in my event:

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#   127.0.0.1       localhost
#   ::1             localhost
255.255.255.255 wpad

Any advice on where my SEDCMD is wrong? The command seems to work fine in a search when I run | rex mode=sed "s/\#.*\n//g"

woodcock
Esteemed Legend

What you really should be doing is stripping all of the wasteful comments like this:

[YourSourcetypeHere]
SHOULD_LINEMERGE = false
LINE_BREAKER = ((?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*))

If you do that, then all you have left is the one line.

54638
Explorer

Thanks. I will try it, but what is the benefit of doing it this way as opposed to using SEDCMD? Doesn't SEDCMD strip the comments as well?

Also, if there are multiple lines at the end of the file, will each line show up as a different event this way?

woodcock
Esteemed Legend

With LINE_BREAKER, the comment text gets dropped as the very first step in processing, instead of a bit later (SEDCMD) or at the very end (nullQueue). It is also simpler. Yes, it should work for multiple non-comment lines; each one shows up as its own event.
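
For anyone who wants to check the regex outside Splunk, here is a rough Python approximation of what the LINE_BREAKER above does to a trimmed copy of the sample file. It is only a sketch: Python's re module stands in for Splunk's regex engine, re.split() stands in for Splunk's event breaking, and the second host entry (10.0.0.5 intranet) is a made-up line added to show the multi-line case.

```python
import re

# Non-capturing copy of the LINE_BREAKER pattern from this thread. Splunk discards
# the text matched by the first capture group; here that group spans the whole
# match, so re.split(), which discards the whole match, is a fair stand-in.
BREAKER = re.compile(r"(?:^|[\r\n]+)(?:\s*#[^\r\n]*|\s*)")

SAMPLE = (
    "# Copyright (c) 1993-2009 Microsoft Corp.\n"
    "#\n"
    "#      102.54.94.97     rhino.acme.com          # source server\n"
    "\n"
    "# localhost name resolution is handled within DNS itself.\n"
    "#   127.0.0.1       localhost\n"
    "255.255.255.255 wpad\n"
    "10.0.0.5 intranet\n"  # hypothetical second entry, not in the original file
)


def split_events(text):
    """Return the non-empty pieces left after removing every breaker match."""
    return [piece for piece in BREAKER.split(text) if piece]


events = split_events(SAMPLE)
print(events)
```

In this approximation the comment lines and the blank line are swallowed by the breaker, and each remaining host entry comes out as a separate piece.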

to4kawa
Ultra Champion
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
SEDCMD-comments = s/#.*//g

Maybe the \n is missing from the event by the time SEDCMD runs, so try the pattern without it, as shown above.
You can also try PREAMBLE_REGEX:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
PREAMBLE_REGEX = #
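
To illustrate the point about \n, here is a small Python sketch comparing the two sed patterns from this thread on single-line events. It assumes that after LINE_BREAKER = ([\r\n]+) each event is one line with its line terminator already removed; Python's re.sub() stands in for SEDCMD, and the sketch does not model what Splunk does with an event that ends up empty.

```python
import re

WITH_NEWLINE = re.compile(r"#.*\n")   # pattern from the original SEDCMD
WITHOUT_NEWLINE = re.compile(r"#.*")  # pattern from the suggested SEDCMD

# Assumption: line breaking has already consumed the newlines, so every event
# is a single line with no trailing "\n".
raw = "# space.\n#   ::1             localhost\n255.255.255.255 wpad\n"
events = [line for line in re.split(r"[\r\n]+", raw) if line]

# Apply each substitution to every event, the way a per-event sed would.
kept_original = [WITH_NEWLINE.sub("", event) for event in events]
kept_suggested = [WITHOUT_NEWLINE.sub("", event) for event in events]

print(kept_original)
print(kept_suggested)
```

Under that assumption, the pattern that requires \n never matches and leaves the comment events untouched, while the pattern without \n empties them and leaves the host entry as it was.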