Are there any resources or white papers out there that talk about Splunk and LFI (Log File Injection)? Splunk makes it amazingly easy to push data in with minimal effort, but what if you want more control and assurance that input data has not been tampered with or modified? How do you prevent (or reduce the likelihood of) log file injection?
I've found a few non-Splunk publications on the topic and was curious how this looks in Splunk. Several of the kinds of attacks I've read about don't really work with Splunk, but some would, and quite frankly Splunk opens the door to a few new kinds of attacks not possible in the typical log-file approach. (Which I'm sure is true of other central logging tools as well.)
This question originally came up in a discussion about which Universal Forwarders are "allowed" to send data to Splunk. It looks like the options are limited to acceptFrom (network source restriction) and requireClientCert (certificate-based restriction). But even if both are enabled, that doesn't stop log injection from occurring if the application or OS producing the logs has been compromised. And in many cases a compromise isn't even required: a poorly written app that doesn't do proper escaping is enough to create undesirable behavior. Since this topic is growing, and I'm the kind of person who desires answers to all of life's questions, I'm requesting help from those who are wiser than I!
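To make the "proper escaping" point concrete, here is a minimal sketch of what the application side could do before a value ever reaches the log file. Everything in it is an assumption for illustration: the function names, the key=value layout, and the choice of JSON string escaping are hypothetical and not anything Splunk requires.

```python
import json


def sanitize_field(value: str) -> str:
    """Return the value as a quoted string that cannot span multiple lines.

    json.dumps escapes CR, LF, other control characters, backslashes and
    double quotes, and wraps the result in double quotes.
    """
    return json.dumps(value)


def format_event(action: str, user: str) -> str:
    """Build one key=value log line (hypothetical layout) from untrusted input."""
    return f"action={sanitize_field(action)} user={sanitize_field(user)}"


# A hostile user name that tries to start a second, forged log line.
hostile = 'bob\n2014-01-01 00:00:00 action="login" user="admin"'
line = format_event("login_failed", hostile)
print(line)
```

The forged "second event" stays inside the quoted user value on a single physical line, so line breaking can no longer be steered by the attacker. It does not by itself protect a loosely written search-time extraction, which may still need to be anchored to the expected position of each field.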
BTW, I'm mainly looking at pre-indexing considerations. I know that Splunk has event signing and audit trails to protect the data once it's been indexed, but how do you protect the data before that point? And from a search-time perspective, how do you deal with the reality that this probably will occur at some point? Because it seems like if you aren't paying attention, this stuff would be really easy to miss.
Here are a few specific security considerations / scenarios I've come up with. There's no guarantee that all of these approaches would even work, but the main point is that there is a wide range of possible attacks that are Splunk-specific, and so far I haven't seen much discussion about them.
"***SPLUNK*** index=_internal" causing an event to be accessible only to admin users and not to the network security team. (Sending events to a non-existent index may draw too much attention due to the resulting warning banners.)
_indextime analysis could find some anomalies, but it's not a direct comparison.) See the Log Injection Attack and Defense link below. Remember that indexed order doesn't always map directly to original log file sequence in a distributed Splunk environment.
requireClientCert=true would prevent unauthorized connections, but it doesn't guarantee that the host doesn't lie about itself; for example, if a system running a UF was hijacked. And unless Splunk attaches the forwarder's signature hash to each event, it's difficult to track down where a given event originated.
"NOT debug" (to exclude a useless message that some developer refuses to disable), then bypassing detection is as simple as injecting the word "debug" anywhere in the log message. (Using the search "NOT log_level=DEBUG" is a step better, but it may still be circumvented, depending on search-time extraction settings.)
src field (which is really just an alias for src_ip), then if "src=bogus" could somehow be injected into the event, the grouping logic could be controlled by an externally dictated value instead of the system-captured IP address.
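The last two scenarios (the keyword exclusion and the injected src value) can be sketched outside of Splunk. The snippet below is only an illustration under stated assumptions: the log layout, the function names, and the "last pair wins" extractor are hypothetical and are not a model of Splunk's actual search-time extraction.

```python
import re

# Hypothetical log line. The trailing "debug src=bogus" is attacker-supplied
# text that was written into the user field without escaping.
event = ("2014-01-01 12:00:00 WARN src=10.1.2.3 "
         "action=login_failed user=debug src=bogus")


def keyword_filter(line: str) -> bool:
    """Mimic a bare NOT debug search: keep only lines without the word."""
    return "debug" not in line.lower()


# The level is read only from its fixed position right after the timestamp.
LEVEL_AT_START = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) ")


def level_filter(line: str) -> bool:
    """Keep the line unless the positional log level is DEBUG."""
    m = LEVEL_AT_START.match(line)
    return m is None or m.group("level") != "DEBUG"


def naive_kv(line: str) -> dict:
    """Naive key=value extraction where a later pair overwrites an earlier one."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def anchored_src(line: str):
    """Read src only from the position where the system writes it."""
    m = re.match(r"^\S+ \S+ [A-Z]+ src=(\S+)", line)
    return m.group(1) if m else None


print(keyword_filter(event))   # the WARN event is hidden by the keyword filter
print(level_filter(event))     # the positional check still keeps it
print(naive_kv(event)["src"])  # the attacker-chosen value wins
print(anchored_src(event))     # the system-written value
```

The design point is the same in both cases: a rule that matches anywhere in the raw text can be steered by whoever controls part of that text, while a rule tied to a fixed position in a known format is much harder to influence.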
Your number one defense against this is to get the files off the source server and into Splunk as soon as possible. After that, you should have reasonable assurance that the log is as written. Any later tampering would result in _indextime anomalies. These can actually be detected, e.g., by alerting on any event where _indextime - _time is significantly different from the average _indextime - _time. Most of the other "attacks" are just ways to try to hide log messages (e.g., trying to sneak "DEBUG" into the record of your unauthorized login), but there's really nothing you can do about that without being more rigorous about your known formats. Fundamentally, that is equivalent to a vulnerability in your log formats and queries that you'd have to design around, but fortunately that is easy to do if you're in a position to do anything about it. The reason you have all those "tricks" and tools in Splunk is that they're there for dealing with crap log formats, which, yes, are going to be vulnerable to this kind of thing.
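As a rough illustration of the lag check described above, here is a minimal sketch that works on events already exported from Splunk as dictionaries with epoch-second _time and _indextime values. The function name, the sample data, and the three-standard-deviation threshold are assumptions chosen for the example, not a recommended setting.

```python
from statistics import mean, pstdev


def lag_outliers(events, threshold=3.0):
    """Return events whose index lag (_indextime - _time) is more than
    `threshold` standard deviations away from the average lag."""
    lags = [e["_indextime"] - e["_time"] for e in events]
    mu = mean(lags)
    sigma = pstdev(lags)
    if sigma == 0:
        return []  # every event has the same lag, nothing stands out
    return [e for e, lag in zip(events, lags)
            if abs(lag - mu) > threshold * sigma]


# Twenty events indexed two seconds after they were written...
events = [{"_time": 1000 + i, "_indextime": 1002 + i} for i in range(20)]
# ...and one event that showed up an hour after its claimed timestamp.
late = {"_time": 500, "_indextime": 4100}
events.append(late)

print(lag_outliers(events))
```

Because a plain average is itself pulled toward large outliers, a small sample can mask the very event you are looking for; comparing against a per-host or per-sourcetype baseline, or using the median, may be more robust in practice.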
I've attempted to re-ask a more specific question on the topic of compromised forwarders. More specifically, how would you track down the problem server: http://answers.splunk.com/answers/116535/how-to-track-down-a-dishonest-uf
I agree that much of this is a log creation problem and that getting the logs to a central (and secure) location ASAP resolves many issues. (It's not possible to hide your tracks by modifying logs if Splunk has already indexed a copy.) That's huge. But anytime you depend on rule-based detection, someone clever can find a way around it. Of course, Splunk shines because you can change the rules (search logic, knowledge objects, ...) and re-check for past anomalies. With that said, I was hoping someone had written or compiled a list of tips, techniques, methodologies, and best practices on the topic.