Splunk Search

How to set the host with host_regex from a file input?

icewolf69
Loves-to-Learn Everything

Hey All, 

 

I'm really struggling here. I'm trying to get a universal forwarder to pull in .txt logs and set the "host" field based on the file name or file path.

Example file path:

C:\SCAP_SCANS\Sessions\2023-02-04_1200\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt

 

Inputs.conf stanza:

[monitor://C:\SCAP_SCANS\Sessions]
disabled = false
ignoreOlderThan = 90d
host_regex = [^\\\]+(?=_SCC)
SHOULD_LINEMERGE = true
MAX_EVENTS = 500000
index = main
source = SCC_SCAP_TXT
sourcetype = SCC_SCAP_TXT
whitelist = (Non-Compliance).*\.(txt)

 

I tried a few different regexes. I checked btool to make sure no other configs are overriding these settings. I tried with and without transforms and props files. I verified that the regex works against the path in a makeresults query.

Anyone have any suggestions?


icewolf69
Loves-to-Learn Everything

It requires a capture group? Do you have to set a specific variable for that?

C:\SCAP_SCANS\Sessions\2023-02-04_1200\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt

The server name is the part of the file name just before "_SCC", here SERVER-test.

The following regex works fine on a makeresults:
(?<host>([^\\\\]+(?=_SCC)))


richgalloway
SplunkTrust

No need to set a variable (perhaps not even allowed).  The first capture group becomes the host name.

According to regex101.com, this regex is more efficient:

\\([^\\\\]+)_SCC
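
For anyone who wants to check the two patterns outside Splunk, here is a minimal sketch in Python. It assumes Python's `re` module behaves like Splunk's PCRE for these simple patterns, and `first_group` is a hypothetical helper written only for this illustration. It shows that the lookahead-only pattern matches but has no capture group, while the pattern above captures the server name in group 1.

```python
import re

# Example path copied from the original post.
path = (r"C:\SCAP_SCANS\Sessions\2023-02-04_1200"
        r"\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt")

# Lookahead-only pattern: it matches text, but defines no capture group.
lookahead_only = re.compile(r"[^\\]+(?=_SCC)")

# Pattern from this reply: group 1 holds the host name.
with_group = re.compile(r"\\([^\\\\]+)_SCC")


def first_group(pattern, text):
    """Hypothetical helper: return capture group 1, or None if there is none."""
    match = pattern.search(text)
    if match is None or pattern.groups == 0:
        return None
    return match.group(1)


print(first_group(with_group, path))      # SERVER-test
print(first_group(lookahead_only, path))  # None (no capture group to read)
```

The lookahead version still finds "SERVER-test" as its overall match, which is probably why it looks fine in a makeresults test, but there is no group 1 for host_regex to use.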
---
If this reply helps you, Karma would be appreciated.

icewolf69
Loves-to-Learn Everything

OK, I think I found out what the issue is.

After changing host_regex to just "(................)" to see what information is being fed into Splunk for that data, it showed only "source::SCC_SCAP", which means the regex is getting its input from the source set in my stanza configuration:
Source=SCC_SCAP

When I removed that line, I started getting "source::<full_file_path>".

So the issue was less that the regex was wrong and more that it was failing to match every time and just defaulting back to the actual host.
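
A rough way to picture this is the Python sketch below. It is only a simplified model of the behavior I observed, not Splunk's actual implementation: `extract_host` and the default host name "forwarder-host" are made up for the illustration, and the pattern is the one suggested earlier in this thread.

```python
import re

# Pattern suggested earlier in this thread; group 1 is the host name.
HOST_PATTERN = re.compile(r"\\([^\\\\]+)_SCC")


def extract_host(source, default_host):
    """Return group 1 if the pattern matches the source string,
    otherwise fall back to the default host (simplified model)."""
    match = HOST_PATTERN.search(source)
    return match.group(1) if match else default_host


# Full file path from the original post.
full_path = (r"C:\SCAP_SCANS\Sessions\2023-02-04_1200"
             r"\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt")

# Source value as it appeared once the stanza overrode it.
overridden_source = "SCC_SCAP"

print(extract_host(full_path, "forwarder-host"))          # SERVER-test
print(extract_host(overridden_source, "forwarder-host"))  # forwarder-host
```

With the overridden source there is no backslash and no "_SCC" in the string, so the pattern can never match and the default host is used.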

But now I'm not sure how to fix the next problem. I don't want a million *.txt files inside the "Sources" sections of the databases. I want all of these text logs under a single source. But I guess if no one knows how to keep them unified without declaring the source in inputs.conf, I have to choose between host_regex and Source=.

 

 


richgalloway
SplunkTrust

The host_regex setting requires a capture group.  The example setting does not have one.

Please specify which part of the file path is the server name, and we should be able to produce a regex for it.
