I have some computing antiques running Unix; I need to monitor some files on them, and get them into Splunk.
I read http://answers.splunk.com/answers/8328/best-practice-for-getting-data-into-splunk-without-a-forwarde... The "scripted/scheduled-copy-files-to-a-machine-that-does-have-a-forwarder" approach seems reasonable and doable, and is probably where I'll end up unless I find something better. BUT....
One thing I've been mentally toying with is running a (perl?) script to tail the files and ship them via TCP to an indexer listening on a port dedicated to the purpose.
I have a hard time believing that we are the first people going down this road; has anyone else done this?
Has anyone cooked up any solutions other than the ones in http://answers.splunk.com/answers/8328/best-practice-for-getting-data-into-splunk-without-a-forwarde... ?
Ah. What kind of antique are we talking about?
very old AIX....
Don't forget that Splunk can be used as a Syslog server: http://docs.splunk.com/Documentation/Splunk/6.2.0/Data/SyslogTCP
(if your Splunk is not running as root, just use iptables to redirect the TCP/514 and UDP/514 ports to the Splunk listening port)
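A minimal sketch of that redirect, assuming the Splunk syslog input is listening on port 5514 (the port number here is a placeholder; use whatever port you configured on the indexer):

```shell
# Redirect the privileged syslog ports to an unprivileged Splunk listener.
# 5514 is a placeholder for your configured Splunk TCP/UDP input port.
iptables -t nat -A PREROUTING -p udp --dport 514 -j REDIRECT --to-ports 5514
iptables -t nat -A PREROUTING -p tcp --dport 514 -j REDIRECT --to-ports 5514
```

This only helps on the receiving (Splunk) side, of course; the antiques just keep sending to 514 as usual.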
Anyway, if you dislike the option above, do you have netcat available on the servers? Even if not, you might be able to easily compile and drop the binary on the servers: http://docs.splunk.com/Documentation/Storm/Storm/User/Howtoforwarddatavianetcat
You could simply use it with tail as in the example above, or create a smarter script that runs every X seconds and sends only the deltas.
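The tail-plus-netcat variant from that doc boils down to something like this (the hostname and port are placeholders, and on very old AIX it's worth verifying that tail -f behaves sanely on your files first):

```shell
# Ship lines from a log file to a dedicated TCP input on the indexer as they
# are written. splunk-indexer.example.com:9999 is a placeholder.
tail -f /var/log/app.log | nc splunk-indexer.example.com 9999
```

The weakness of this one-liner is exactly what gets raised below: if it dies and restarts, plain tail either re-reads the tail end of the file (duplicates) or misses whatever was written while it was down.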
Well, I would still prefer to spend the time deploying a proper syslog daemon and using it; it's nothing complicated. You might find other things you could send to that syslog server in the future to justify the effort.
The receiving end isn't the issue; the sending end is the issue.
netcat or socat can take care of the transport; now the issue is just having a tail-like utility that can persist its view of where it left off (so we don't double-index after a restart)...
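A minimal sketch of that "persist where we left off" idea in plain Bourne shell (the file and state-file paths are placeholders): it records the file's byte count after each run and, on the next run, emits only the bytes appended since then, so a restart never re-sends already-shipped data. Rotation/truncation is handled crudely by starting over when the file shrinks.

```shell
#!/bin/sh
# send_deltas FILE STATEFILE: print only the bytes appended to FILE since
# the byte offset recorded in STATEFILE, then record the new offset.
send_deltas() {
    file=$1
    state=$2
    last=0
    [ -f "$state" ] && last=$(cat "$state")
    # Current size in bytes; tr strips the leading blanks some wc's emit.
    size=$(wc -c < "$file" | tr -d ' ')
    # If the file shrank (rotated or truncated), start from the beginning.
    [ "$size" -lt "$last" ] && last=0
    # Emit bytes last+1 .. end; 'tail -c +N' is POSIX, so old AIX should cope.
    tail -c +"$((last + 1))" "$file"
    echo "$size" > "$state"
}
```

Run from cron every minute or so and pipe the output through nc/socat to the indexer's TCP port. Note this is byte-based, so a run can end mid-line if the writer is mid-write; a smarter version would only ship up to the last newline.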
The biggest reason I can see for deploying a syslog server that forwards to Splunk is that you will drop lots of syslog packets whenever you restart a Splunk indexer: a restart takes a long time, and syslog is usually UDP, so there is no retry if a send fails while the indexer is down.
A syslog server running alongside a forwarder is a good option if you can't put a forwarder on the source itself, but a forwarder is always better. If your network goes down, a syslog server (or Splunk relying on syslog) never gets the data, while forwarders will resume from the point of disconnect once the network comes back up. No lost data.
You can use the REST API to send data to Splunk, but it isn't much different from using a forwarder. We have Cloud Foundry servers that use the REST API because they have more control over the data sent.
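For reference, a hedged sketch of the REST approach via Splunk's receivers/simple endpoint (host, credentials, and metadata below are all placeholders; check the REST API reference for your Splunk version before relying on this):

```shell
# POST a raw event to the Splunk management port (8089). -k skips certificate
# validation, which you may need for Splunk's default self-signed cert.
curl -k -u admin:changeme \
  "https://splunk.example.com:8089/services/receivers/simple?source=antique01&sourcetype=syslog" \
  --data-binary @- <<'EOF'
Jan  1 00:00:00 antique01 app[42]: example event
EOF
```

Whether this is viable on very old AIX depends on having curl (or something that can speak HTTPS) available, which may be a taller order than netcat.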
Do your antiques speak syslog or a related network-based protocol?
The antiques do speak syslog, but they do not have a syslog daemon available that can send files via the syslog protocol.