Monitoring Splunk

When should I use a FIFO for an input?

Splunk Employee

I notice there is support for FIFOs as inputs. Are there any benefits to using a FIFO, or is it just support for those cases where I don't want to log to a file? What are the downsides?

1 Solution

Splunk Employee

On a reasonable OS and filesystem, I think you can get pretty reasonable behavior from a plain file as well, with small (under 4K) writes, if you open the file in append mode, unbuffered, and flush after each write. Your apps should really be doing that with logfiles anyway, if you want to find out what happened when they crash.
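A minimal sketch of that append-and-flush pattern in Python (the log path and messages here are placeholders, not anything Splunk-specific):

```python
import os

LOG_PATH = "app.log"  # hypothetical log file path

def log_line(message: str) -> None:
    # "a" opens in append mode, so every write lands at the end of the file.
    # buffering=1 gives line buffering; the explicit flush plus fsync pushes
    # each record through to the OS and onto disk, so the last lines survive
    # an application crash.
    with open(LOG_PATH, "a", buffering=1) as f:
        f.write(message + "\n")
        f.flush()
        os.fsync(f.fileno())

log_line("service started")
log_line("request handled in 12ms")
```

The fsync is the expensive part; for small, infrequent log writes the cost is usually acceptable in exchange for not losing the final lines before a crash.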

So I'm not sure the big deal with FIFOs is atomicity, though if you're sure of the behavior, go for it; that sort of thing is pretty well outside the Splunk boundary.

Where I've found FIFOs useful is when writing automated inputs, like scripts. The FIFO acts as flow control for your program, which gives you a pretty good idea of when and how fast that data is getting into Splunk. It also allows you to be pretty lazy in your script authoring without much of a problem.
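The flow-control effect can be sketched in a few lines of Python. This is a self-contained demo, not real Splunk wiring: a reader thread stands in for Splunk's FIFO input, and the writer's blocking open and writes are what throttle the "script" to the reader's pace.

```python
import os
import tempfile
import threading

# Hypothetical FIFO path for the demo.
fifo_path = os.path.join(tempfile.mkdtemp(), "splunk_input.fifo")
os.mkfifo(fifo_path)

received = []

def reader():
    # Stands in for Splunk's FIFO input; open blocks until a writer connects.
    with open(fifo_path, "r") as pipe:
        for line in pipe:
            received.append(line.rstrip("\n"))

t = threading.Thread(target=reader)
t.start()

# The "script": its open blocks until the reader is there, and its writes
# block whenever the pipe fills up, so it can never outrun the consumer.
with open(fifo_path, "w") as pipe:
    for i in range(3):
        pipe.write(f"event {i}\n")

t.join()
print(received)  # → ['event 0', 'event 1', 'event 2']
```

Because the writer blocks rather than buffering unboundedly, a lazy script gets backpressure for free: when the consumer stalls, the producer stalls with it.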

The most obvious downside is crashes. If the system crashes, or if Splunk crashes, some data will be lost. Splunk can't put the data from the FIFO on disk before it reads it, and the OS isn't going to provide a backing disk store. Even if the source app still has the data, there's no generic protocol for the program and Splunk to renegotiate the position.

The second problem is debuggability. If something fishy is going on with your data stream, the FIFO offers no clues; it's hidden from view.


Explorer

As long as each write is under 4K (POSIX's PIPE_BUF limit, which is 4096 bytes on Linux), the operation is atomic. This allows multiple processes to use the same pipe - /var/log/splunk for example.
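A small Python sketch of that guarantee, under the assumption of a POSIX system: two writer threads share one named pipe, each emitting whole records well under PIPE_BUF, and the reader never sees two writers' records interleaved.

```python
import os
import select
import tempfile
import threading

# Demo FIFO path; select.PIPE_BUF exposes the POSIX atomicity limit
# (4096 on Linux, the source of the "under 4k" rule of thumb).
fifo = os.path.join(tempfile.mkdtemp(), "shared.fifo")
os.mkfifo(fifo)

lines = []

def reader():
    with open(fifo, "rb") as pipe:
        for raw in pipe:
            lines.append(raw.decode().rstrip("\n"))

t = threading.Thread(target=reader)
t.start()

# Hold one write end open so the reader doesn't see EOF between writers.
keep_fd = os.open(fifo, os.O_WRONLY)

def writer(tag):
    fd = os.open(fifo, os.O_WRONLY)
    for i in range(50):
        record = f"{tag}:{i}\n".encode()
        # Writes of at most PIPE_BUF bytes are atomic: each record lands
        # in the pipe as one unbroken unit, even with concurrent writers.
        assert len(record) <= select.PIPE_BUF
        os.write(fd, record)
    os.close(fd)

threads = [threading.Thread(target=writer, args=(tag,)) for tag in "AB"]
for th in threads:
    th.start()
for th in threads:
    th.join()
os.close(keep_fd)
t.join()

# Every record arrived intact: exactly one tag and one number per line.
assert len(lines) == 100
assert all(line.count(":") == 1 for line in lines)
```

Records larger than PIPE_BUF lose this guarantee and can be split and interleaved, which is why the sub-4K caveat matters when several processes share one pipe.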

But this is a "I don't want to write to a file" type scenario.

The only other use that I can think of is letting Splunk observe communications between processes that already use named pipes, and act on them - logging XML-RPC calls, for example.

The downsides I can think of: reads are blocking, you can't seek, and the data can get lost if it isn't cached on disk before it's read from the pipe. ACLs are limited to what the filesystem provides. But all of those are only limits if the pipes were to pretend to be files.