Getting Data In

Forwarder and props.conf troubleshooting

dpadams
Communicator

I've been having trouble getting a host override transformation in my props.conf/transforms.conf to work and want to figure out if the problem is in the config files or in the running environment. I've got a centralized log server (forwarder) that writes out logs with data drawn from multiple hosts. We want to override the default host so that the indexer receiving the logs shows the original host name, not the log server's name. As I understand it:

  • Performing a host override is possible and normal.

  • The common practice is to run the transform on the indexer. For internal policy reasons, that is impossible for us, or at least unlikely.

  • Only a 'heavy' forwarder will process the transform, not a light or universal forwarder.

What I'm trying to confirm now:

  • What kind of forwarder is running on my centralized log server?

  • Is there a way to get some feedback from the forwarder regarding the application of transforms? splunkd.log doesn't seem to have any relevant entries.

In other words, how can I check that the forwarder is a heavy forwarder and how do I check what (if any) transformations are being applied?

For info:

splunk enable app SplunkForwarder

The SplunkForwarder is listed as unconfigured, enabled, and invisible. I've got config files in place and the Web management GUI lists the forwarding rule and says that it's enabled. That doesn't add up to a coherent picture for me. Is this what a heavy forwarder should look like?

splunk list-forward-server

This returns "Active forwards: None" but "Configured but inactive forwards" lists the forward that I'm using. Events are forwarding without any obvious problem.

I think that the SSL part of the forward may not be working correctly, if that's relevant.

Other info:
  • Events are flowing through to the Splunk indexer. So, forwarding is happening.

  • I'm using custom logs with custom types defined by the Splunk forwarder. These types are recognized by the indexer so I take it that my props.conf and inputs.conf are in the right locations.

emotz
Splunk Employee

For most things - you can go here and test out your regular expressions.
http://gskinner.com/RegExr/
The site lists help and useful information on the page next to the various expressions.
This might work, but in the editor it was only matching the last line (likely because $ only matches at the very end of the input unless multiline mode is on). Because you said the host was near the end of the line, I matched it using the $ anchor, which means end of line. The parentheses are around the capture group we are interested in here: the host.

([a-zA-Z0-9|-]+)\s\w+$

Try it out.

emotz
Splunk Employee

So, in your example:
logs/vm-1/
access_log.txt
action_log.txt
error_log.txt

logs/vm-2/
access_log.txt
action_log.txt
error_log.txt

logs/vm-3/
access_log.txt
action_log.txt
error_log.txt

Your monitor clause would look like this:

[monitor:///logs/*/access_log.txt]
sourcetype = access
index = xyz
host_segment = 2

So you don't need to restart the forwarder every time you add a device. It is a wildcard and should work just fine.
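To make the segment counting concrete, here is a minimal Python sketch of how a host_segment value maps a monitored path to a host name. It only illustrates the counting rule (the Nth "/"-separated piece of the path); it is not Splunk's own code, and the function name host_from_segment is made up for this example.

```python
def host_from_segment(path, host_segment):
    """Return the Nth "/"-separated segment of path (1-based), or None."""
    # Drop empty pieces so the leading "/" does not count as a segment.
    segments = [part for part in path.split("/") if part]
    if 1 <= host_segment <= len(segments):
        return segments[host_segment - 1]
    return None


if __name__ == "__main__":
    # Paths follow the directory layout discussed in this thread.
    for path in ("/logs/vm-1/access_log.txt", "/logs/vm-2/error_log.txt"):
        print(path, "->", host_from_segment(path, 2))
```

With host_segment = 2, /logs/vm-1/access_log.txt gives vm-1, so a new vm-N directory is picked up by the wildcard without editing inputs.conf.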

I'm not sure how you are rolling your logs, but you will want to capture the rolled logs too. Typically, monitoring the current log and the most recently rolled log should handle it; all of the other, older logs should be removed. If you are going to zip them along the way, be sure to blacklist *.gz (or whatever extension you use) in your monitor clause, or we will reindex it all, since it looks like it is coming from a different source.

You also want to make sure that you limit the number of files being monitored by a single forwarder. In this configuration, you should be able to use the Universal Forwarder, as it is just inputs and outputs.

dpadams
Communicator

Thanks for your comment about the faulty regex pattern. I tried it with rex against _raw doing an extract and it seemed to work. Obviously, I wouldn't be posting if everything were working... Pulling the data from the log is very much my preferred solution.

In this log, the host name is almost at the end of the line. Below are the values from the sample posted earlier:

345101-VM3
345999-VM4
SRST-Remote-2

Is there a smart strategy for testing out regex patterns for use in this situation?
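One low-effort option is to run candidate patterns against a few real log lines outside Splunk before putting them in transforms.conf. The sketch below uses Python's re module on three lines from the action_log.txt sample in this thread, trying both the field-counting pattern from the posted transforms.conf and the end-anchored pattern suggested in the replies. The helper name extract_hosts is made up for this example. Note that Python's regex engine is not the one Splunk uses, so treat this as a sanity check of the pattern only; it says nothing about whether the forwarder actually applies the transform.

```python
import re

# Lines copied from the action_log.txt sample posted in this thread.
SAMPLE_LINES = [
    "[23/Jun/2012:01:50:06 +0000] add appuser FANGLET-AU000-0000000000 0 "
    "1336643d3082237d75191d4362fbd941 - 1.0 - 345101-VM3 SRST",
    "[23/Jun/2012:06:48:04 +0000] add appuser FRINDO-UK-EGGPLNT 0 "
    "c9e9d9c86fe1592e3427592c4c4bc6a buzz 1.0 8 345999-VM4 SRST",
    "[23/Jun/2012:06:48:57 +0000] add appuser FRINDO-US000-0000000000 0 "
    "8321965193395108fe7d85878f8c9a43 - 1.1.4 - SRST-Remote-2 SRST",
]

# Both patterns are taken verbatim from posts in this thread.
PATTERNS = {
    "field_count": r"(?i)^(?:[^ ]* ){10}([^ ]+)",
    "end_anchored": r"([a-zA-Z0-9|-]+)\s\w+$",
}


def extract_hosts(pattern, lines):
    """Return capture group 1 for each line, or None where nothing matches."""
    compiled = re.compile(pattern)
    hosts = []
    for line in lines:
        match = compiled.search(line)
        hosts.append(match.group(1) if match else None)
    return hosts


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(name, extract_hosts(pattern, SAMPLE_LINES))
```

On these three lines both patterns capture 345101-VM3, 345999-VM4 and SRST-Remote-2. A None in the output would point at a line where the pattern does not fit, which is easier to spot here than after a forwarder restart.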

emotz
Splunk Employee

It is possible to extract the hostname from the data itself. In your sample data, which field is the hostname for action_log.txt? Your regex is not correct.
The regex needs to match all of the surrounding text outside the parentheses; whatever is inside the parentheses becomes your first capture group, which is equivalent to $1.

dpadams
Communicator

Thanks for the answer! It sounds like what you describe will work in my case and I like being able to use a Universal Forwarder. This will take a bit of a rework on the logging machine as it will need to detect when a new host has been introduced into the data. I'd still prefer to do it using a transformation but that just doesn't seem to be happening.

Thanks again!

dpadams
Communicator

Thanks very much for answering. I'm posting an answer as the details take more than a comment will hold.

All of our logs are custom, such as "action log", "error log", "system check log" and "access log". In other words, I'm not dealing with syslog-ng or any log type that Splunk already has a sourcetype definition for. I've seen mention of having a directory per machine. In my case, would that end up something like this?

logs/vm-1/
   access_log.txt
   action_log.txt
   error_log.txt

logs/vm-2/
   access_log.txt
   action_log.txt
   error_log.txt

logs/vm-3/
   access_log.txt
   action_log.txt
   error_log.txt

etc. Hopefully, I'm getting this wrong. A setup like the above would be possible but would require reconfiguring inputs.conf and restarting Splunk each time we change the roster of contributing machines. That would work poorly for us as it's way too much hands-on activity. What I'm really after is overriding the host by extracting the value from the log. I'd like (probably need, given organizational constraints) to do this on the forwarder, not the indexer. The position of the host in any particular log is predictable. (And in our control.) I've been through the docs and threads here over and over with no joy so I must be missing something:

  • My regex is wrong? (It seems right using rex)
  • My props.conf/transforms.conf settings are being ignored because I'm not a heavy forwarder?
  • Something else?

Here are some file samples, cut down to focus on just one of the logs. In fact, there are a bunch of logs with the same overall approach (central log server, host name inside of the log data, need to do an override.)

# action_log.txt sample
[23/Jun/2012:01:50:06 +0000] add appuser FANGLET-AU000-0000000000 0 1336643d3082237d75191d4362fbd941 - 1.0 - 345101-VM3 SRST
[23/Jun/2012:01:51:38 +0000] add appuser FANGLET-US000-0000000000 0 9fb0638027e115dc36a313700ada3f54 - 1.1.4 - 345101-VM3 SRST
[23/Jun/2012:01:51:53 +0000] add appuser FANGLET-AU-EGGPLNT 0 d1128ee5236b17a41825832b890a8091 cdma_spyder 1.0 10 345101-VM3 SRST
[23/Jun/2012:01:52:47 +0000] add appuser FANGLET-AU000-0000000000 0 5d3ded5a3efbc9102c85e319d08c461d - 1.0 - 345101-VM3 SRST
[23/Jun/2012:06:48:04 +0000] add appuser FRINDO-UK-EGGPLNT 0 c9e9d9c86fe1592e3427592c4c4bc6a buzz 1.0 8 345999-VM4 SRST
[23/Jun/2012:06:48:20 +0000] add appuser FANGLET-AU000-0000000000 0 d0e3cc86221875df6485f28e6246bcf8 - 1.0 - 345999-VM4 SRST
[23/Jun/2012:06:48:56 +0000] add appuser FRINDO-US000-0000000000 0 459d7c547c40efa025feb0ea9fd93998 - 1.1.4 - 345999-VM4 SRST
[23/Jun/2012:06:48:57 +0000] add appuser FRINDO-US000-0000000000 0 8321965193395108fe7d85878f8c9a43 - 1.1.4 - SRST-Remote-2 SRST

# inputs.conf
[monitor://C:\Program Files\xyz\logs\action_log.txt]
disabled = false
index = xyz
followTail = 0
sourcetype = action

# props.conf
[source::.../action_log.txt]
TRANSFORMS-action-host=action_host_override
sourcetype = action
TZ = UTC

# transforms.conf
[action_host_override]
DEST_KEY = MetaData:Host
REGEX = (?i)^(?:[^ ]* ){10}([^ ]+)
FORMAT = host::$1

# outputs.conf
[tcpout]
defaultGroup = lb_9997
disabled = false
maxQueueSize = 1000
indexAndForward = false
forwardedindex.filter.disable = true

[tcpout:lb_9997]
disabled = false
server = x.y.z:9997
autoLB = true
autoLBFrequency = 60
compressed = true

[tcpout-server://x.y.z:9997]
altCommonNametoCheck = idx
sslCertPath = $SPLUNK_HOME/etc/auth/server.pem
sslCommonNametoCheck = x.y.z
sslPassword = $1$Q929lfZOAu5w
sslRootCAPath = $SPLUNK_HOME/etc/auth/cacert.pem
sslVerifyServerCert = false

Thanks again for writing and trying to help! There's a bit more on my transforms.conf efforts here.

emotz
Splunk Employee

The Splunk heavy forwarder and the lightweight forwarder are both apps included by default in the standard Splunk install. The same install that contains the search head and the indexer includes forwarding capabilities. These apps can be seen in the manager >> apps listing.

Could you please include a sample of the logs, a directory listing, inputs.conf, props.conf, transforms.conf and outputs.conf, and anything else you think is useful?

The best practice for rsyslog and syslog-ng is to write out logs in a separate directory per device. For example, files are written to /logs/cisco/network_or_firewall/messages. In inputs.conf you include:

[monitor:///logs/cisco/*/]
index = your_index
sourcetype = cisco
host_segment = 3
