How to use host_regex with a file input?

icewolf69
Loves-to-Learn Everything

Hey All, 

 

I'm really struggling here. I'm trying to get a universal forwarder to pull in .txt logs and set the "host" field based on the file name or file path.

Example file path:

C:\SCAP_SCANS\Sessions\2023-02-04_1200\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt

 

Inputs.conf stanza:

[monitor://C:\SCAP_SCANS\Sessions]
disabled = false
ignoreOlderThan = 90d
host_regex = [^\\\]+(?=_SCC)
SHOULD_LINEMERGE = true
MAX_EVENTS = 500000
index = main
source = SCC_SCAP_TXT
sourcetype = SCC_SCAP_TXT
whitelist = (Non-Compliance).*\.(txt)

 

I've tried a few different regexes. I checked btool to make sure no other configs are overriding these settings. I tried with and without transforms and props files. I also verified that the regex works against the path using a makeresults query.

Anyone have any suggestions?


icewolf69
Loves-to-Learn Everything

So it requires a capture group? Do I have to assign it to a specific variable?

C:\SCAP_SCANS\Sessions\2023-02-04_1200\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt

The server name (bold and underlined in the original post) is SERVER-test.

The following regex works fine in a makeresults search:
(?<host>([^\\\\]+(?=_SCC)))


richgalloway
SplunkTrust

No need to set a variable (perhaps not even allowed).  The first capture group becomes the host name.

This regex is more efficient, according to regex101.com:

\\([^\\\\]+)_SCC
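
As a quick sanity check outside Splunk, the difference between a lookahead-only pattern and one with a capture group can be sketched with Python's `re` module. This is only an approximation: Splunk has its own regex engine, the `first_group` helper is hypothetical, and the lookahead pattern is written here with a `[^\\]` character class rather than exactly as posted in the stanza.

```python
import re

# Sample path copied from the question in this thread.
PATH = (
    r"C:\SCAP_SCANS\Sessions\2023-02-04_1200"
    r"\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt"
)

# Lookahead only: it can match, but it defines no capture group.
lookahead_only = re.compile(r"[^\\]+(?=_SCC)")

# Suggested form: a literal backslash, the captured name, then _SCC.
with_group = re.compile(r"\\([^\\]+)_SCC")


def first_group(pattern, text):
    """Hypothetical helper: return capture group 1, or None if unavailable."""
    match = pattern.search(text)
    if match is None or pattern.groups < 1:
        return None
    return match.group(1)


print(lookahead_only.groups)               # number of capture groups defined
print(first_group(lookahead_only, PATH))   # nothing to hand over as a host
print(first_group(with_group, PATH))       # the captured server name
```

Under those assumptions, the lookahead-only pattern matches the server name but exposes no group 1, while the pattern with parentheses captures `SERVER-test`.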
---
If this reply helps you, Karma would be appreciated.

icewolf69
Loves-to-Learn Everything

OK, I think I found out what the issue is.

After changing host_regex to just "(................)" to see what string the regex is actually being applied to, it showed only "source::SCC_SCAP" (cut off by the fixed-width capture), which means it is picking up the value from this line in my stanza:
source = SCC_SCAP_TXT

When I removed that line, I started getting "source::<full_file_path>".

So the issue was not the regex itself. It was being applied to the overridden source value, failing every time, and defaulting back to the actual host.

But now I'm not sure how to fix the next problem. I don't want a million *.txt file paths showing up under "Sources" in the index. I want all of these text logs under a single source. But I guess if no one knows how to keep them unified without declaring the source in inputs.conf, I have to choose between host_regex and source =.
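
The fallback described above can be reproduced in a rough Python sketch. It assumes, based only on the output reported in this thread, that the regex is evaluated against a string of the form `source::<value>`. The function `resolve_host` and the default host name `forwarder01` are made up for illustration and are not Splunk APIs.

```python
import re

# Pattern suggested earlier in the thread: group 1 is the server name.
HOST_PATTERN = re.compile(r"\\([^\\]+)_SCC")


def resolve_host(source_key, default_host):
    """Hypothetical model: use capture group 1, else fall back to the default."""
    match = HOST_PATTERN.search(source_key)
    return match.group(1) if match else default_host


# Without a source override, the key carries the full file path.
full_path_key = "source::" + (
    r"C:\SCAP_SCANS\Sessions\2023-02-04_1200"
    r"\SERVER-test_SCC-5.7_2023-02-04_111238_Non-Compliance_MS_Windows_10_STIG-2.7.1.txt"
)

# With "source = SCC_SCAP_TXT" in the stanza, the path is no longer visible.
overridden_key = "source::SCC_SCAP_TXT"

print(resolve_host(full_path_key, "forwarder01"))
print(resolve_host(overridden_key, "forwarder01"))
```

In this model the overridden value contains no backslash and no server name, so the pattern cannot match and the default host is returned, which is consistent with the behavior reported above.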

 

 


richgalloway
SplunkTrust

The host_regex setting requires a capture group.  The example setting does not have one.

Please specify which part of the file path is the server name and we should be able to produce a regex for it.
