Extracting hostname from filename - inputs.conf on UF - host_regex issue

dewald13
Path Finder

I'm having an issue with Blue Coat logs that are dropped on a server running a Universal Forwarder (UF). I'm attempting to extract the hostname from the file name with the following:

host_regex = /logs/rsyslog/bclogs/(.*)-\d{6}[.]log[.]gz

Checked this regex in regexr and it works perfectly.


Sample file names - Host format (ABC-G-PXYW-XXX)

/logs/rsyslog/bclogs/ABC-G-PXYW-002-032016.log.gz
/logs/rsyslog/bclogs/AEC-G-PXYW-001-032016.log.gz
/logs/rsyslog/bclogs/ABC-G-PXYW-002-032014.log.gz
/logs/rsyslog/bclogs/DEF-G-PXYW-003-032016.log.gz

The host is coming in set to the name of the log server, rather than the name extracted from the file name.

Thoughts?

1 Solution

bwooden
Splunk Employee

If you've restarted your forwarder and don't have any host overrides on your parser/indexer, your regex should work. As should something like this:

host_regex=/logs/rsyslog/bclogs/([\w-]+)(?=-\d{6}\.log\.gz)
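Both patterns can be checked quickly against the sample file names from the question. The following is a minimal sketch, assuming Python's re module is a reasonable stand-in for Splunk's regex engine for these particular constructs, and assuming the pattern is applied as an unanchored search against the full path. The helper name extract_host is made up for illustration.

```python
import re

# Sample paths copied from the question.
SAMPLES = [
    "/logs/rsyslog/bclogs/ABC-G-PXYW-002-032016.log.gz",
    "/logs/rsyslog/bclogs/AEC-G-PXYW-001-032016.log.gz",
    "/logs/rsyslog/bclogs/ABC-G-PXYW-002-032014.log.gz",
    "/logs/rsyslog/bclogs/DEF-G-PXYW-003-032016.log.gz",
]

# The two patterns discussed in this thread.
ORIGINAL = r"/logs/rsyslog/bclogs/(.*)-\d{6}[.]log[.]gz"
LOOKAHEAD = r"/logs/rsyslog/bclogs/([\w-]+)(?=-\d{6}\.log\.gz)"


def extract_host(pattern, path):
    """Hypothetical helper: return capture group 1, or None if no match."""
    match = re.search(pattern, path)  # assumption: unanchored search
    return match.group(1) if match else None


for path in SAMPLES:
    # Print what each pattern would capture as the host.
    print(path, extract_host(ORIGINAL, path), extract_host(LOOKAHEAD, path))
```

If both columns show the host portion (for example ABC-G-PXYW-002) for every sample, the regex itself is probably not the problem, and the restart or a downstream override is the more likely cause.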

dewald13
Path Finder

That worked with the "/"

Thanks!

dshpritz
SplunkTrust

There may also be some metadata rewrites happening, depending on the sourcetype (for example, the syslog sourcetype has built in rewrites).

dshpritz
SplunkTrust

Just for a sanity check, has the UF been restarted? The regex looks correct. The other thought is that the system doing the parsing (Heavy Forwarder or Indexer) is overwriting it.

dewald13
Path Finder

Try this one more time.
"^\/logs\/rsyslog\/bclogs\/(.*)-d{6}[.]log[.]gz"

dshpritz
SplunkTrust

You need two backslashes for it to display correctly on Splunkbase:
host_regex = ^/logs/rsyslog/bclogs/(.*)-\d{6}[.]log[.]gz

(bitten me tons of times)

dewald13
Path Finder

the site is ripping out the backslashes...

"^\/logs\/rsyslog\/bclogs\/(.*)-\d{6}[.]log[.]gz"

dewald13
Path Finder

This is the current inputs.conf on the Universal Forwarder

index = proxysg
sourcetype = squid
ignoreOlderThan = 60m
disabled = false
host_regex = /logs/rsyslog/bclogs/(.*)-\d{6}[.]log[.]gz

kristian_kolb
Ultra Champion

You're not changing the source, are you? See below.

host_regex = <regular expression>
* If specified, <regular expression> extracts host from the path to the file for each input file. 
    * Detail: This feature examines the source key, so if source is set
      explicitly in the stanza, that string will be matched, not the original filename.
* Specifically, the first group of the regex is used as the host. 
* If the regex fails to match, the default "host =" attribute is used.
* If host_regex and host_segment are both set, host_regex will be ignored.
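To make those rules concrete, here is a toy model of them in Python. It is only a sketch of the documented behavior, not Splunk's actual code: the function name resolve_host, the default host "logserver01", and the overridden source value "bluecoat" are all made up for illustration, and falling back to the default host for an out-of-range host_segment is an assumption.

```python
import re


def resolve_host(source, default_host, host_regex=None, host_segment=None):
    """Toy model of the quoted rules; hypothetical, not Splunk's implementation."""
    # Rule: if host_regex and host_segment are both set, host_regex is ignored.
    if host_segment is not None:
        segments = [s for s in source.split("/") if s]
        # Assumption: segments are counted from 1; out of range -> default host.
        if 1 <= host_segment <= len(segments):
            return segments[host_segment - 1]
        return default_host
    if host_regex is not None:
        # Rule: the regex is matched against the source key, which is the
        # file path unless source is set explicitly in the stanza.
        match = re.search(host_regex, source)
        # Rule: the first capture group is used as the host.
        if match and match.lastindex:
            return match.group(1)
    # Rule: if the regex fails to match, the default "host =" value is used.
    return default_host


regex = r"/logs/rsyslog/bclogs/(.*)-\d{6}[.]log[.]gz"
path = "/logs/rsyslog/bclogs/ABC-G-PXYW-002-032016.log.gz"

# Source left alone: the regex sees the file path.
print(resolve_host(path, "logserver01", host_regex=regex))        # ABC-G-PXYW-002
# Source overridden in the stanza: the regex sees the override string instead.
print(resolve_host("bluecoat", "logserver01", host_regex=regex))  # logserver01
```

The second call shows the symptom described in the question: once the regex no longer sees the file path, the host falls back to the log server's own name.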

Please post the full inputs.conf stanza for the bc logs.

/k
