Getting Data In

How to track down a dishonest UF?

Lowell
Super Champion

Splunk allows you to assign host, source, and sourcetype (metadata) to all indexed events. These can be set up statically or dynamically in inputs.conf, and they can be changed by transforms at index time. What techniques can you use to verify the accuracy of these metadata fields? More specifically, if an event comes in and the metadata seems wrong, what techniques can be used to identify and resolve this kind of problem?

Every deployment is different, so I made up a more specific scenario to illustrate the problem. It focuses on the accuracy of the host field.

Let's say a Splunk system consists of 1000 servers, each with a Universal Forwarder (UF). All events are forwarded over SSL to one of 4 indexers using automatic load balancing. One of the servers (running a UF) has been compromised, and the security team suspects an issue because of an increase in bogus anomalies being reported by Splunk. The Splunk admin noticed that data has recently started coming in for hosts which don't yet have a UF deployed. Weird. Upon further investigation, the only events for these hosts are "bogus" and frequently trigger false positives. The Splunk admin team suspects that a UF is lying about its "host" and is intentionally sending bogus data, presumably to cover up its true intentions by causing distractions. You're a Splunk admin: how do you track down which of the 1000 UFs has been compromised? (You have Splunk admin and root access to the Splunk indexers, but not to the other servers on the network.)

lguinn2
Legend

As a complement to other ideas: use the acceptFrom attribute in inputs.conf on the indexers. Each indexer will have a [splunktcp://<port>] stanza. Add acceptFrom to this stanza to limit the forwarder connections. I am not sure which protocol layer is used here, but I think it is TCP and not Splunk settings which determine this.

From inputs.conf.spec --

acceptFrom = <network_acl> ...
* Lists a set of networks or addresses to accept data from.  These rules are separated by commas or spaces
* Each rule can be in the following forms:
*   1. A single IPv4 or IPv6 address (examples: "10.1.2.3", "fe80::4a3")
*   2. A CIDR block of addresses (examples: "10/8", "fe80:1234/32")
*   3. A DNS name, possibly with a '*' used as a wildcard (examples: "myhost.example.com", "*.splunk.com")
*   4. A single '*' which matches anything
* Entries can also be prefixed with '!' to cause the rule to reject the
  connection.  Rules are applied in order, and the first one to match is
  used.  For example, "!10.1/16, *" will allow connections from everywhere
  except the 10.1.*.* network.
* Defaults to "*" (accept from anywhere)

Of course, you need to be careful that this list of accepted connections corresponds to the list of forwarders in your forwarder management (or deployment server) configuration. Otherwise, you could configure forwarders to send data to indexers, while the indexers are not configured to accept connections from the forwarders - yikes!
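
As a rough illustration of the attribute described above, an indexer's inputs.conf might look something like the sketch below. The port 9997 and the 10.1.0.0/16 network are assumptions made up for this example, not values from the original question; substitute whatever your forwarders actually use.

```
# inputs.conf on each indexer (sketch; port and network are assumed values)
[splunktcp://9997]
# Rules are applied in order and the first match wins:
# accept forwarders on the 10.1.0.0/16 network, reject everything else.
acceptFrom = 10.1.0.0/16, !*
```

Because the first matching rule is the one used, the trailing "!*" only applies to addresses that did not match the earlier rule.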

lguinn2
Legend

It would stop a bogus machine from joining the Splunk environment, unless it was spoofing the IP address. It wouldn't stop an "accepted" machine from lying about its metadata (host, etc).


Lowell
Super Champion

Yeah, this would have to be at the TCP layer controlling the inbound connections to the splunktcp port. Fundamentally I don't think this would stop a forwarder from lying about itself, would it? And the same is true with requireClientCert. Once a forwarder is connected (after coming from an approved network source and/or having the right cryptographic signature), it can say whatever it wants about who it is. (Much like email or snail mail can be given a bogus "from" address.)


dstaulcu
Builder

Deploy an app with a scripted input that outputs the true hostname. Then look for mismatches between the host field and the hostname reported in the events from that scripted input.

or

Deploy an app with a scripted input that retroactively searches the local machine for the log entry in question (message, time, type, etc.) and then returns "host found!" along with the actual computer name.
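
A minimal sketch of how the first idea might be wired up in the deployed app's inputs.conf. The script name true_hostname.sh, the sourcetype true_hostname, and the one-hour interval are all hypothetical choices for this example; the script itself is assumed to print a single line containing the machine's real hostname.

```
# inputs.conf inside the deployed app (sketch; all names are hypothetical)
# bin/true_hostname.sh is assumed to print the machine's real hostname.
[script://./bin/true_hostname.sh]
# Run the script once an hour (interval is in seconds).
interval = 3600
sourcetype = true_hostname
disabled = 0
```

On the search side, you would then compare the host field on these events with the hostname in the event text. Note that this only helps if the forwarder lies about host on some inputs and not on its default, since the scripted input picks up the forwarder's default host as well.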

lguinn2
Legend

Nice idea. Be sure to create a serverclass of "everyone" to make sure that every client gets the app.

Although it would be possible for a rogue forwarder to avoid this app simply by turning off the deployment client configuration.


lguinn2
Legend

Great question. Mostly commenting to bump it, though I am thinking about it...
