I'm looking to set up a Heavy Forwarder as a data processing server. We get data logs in a specific format dropped on our production machines, but they need to be opened and converted to CSV by a special script. Because the data originally lands on the production servers, a process on the same server converts it to CSV, and the CSV is then picked up by the Universal Forwarder and sent to the Indexer.
I would like to move the conversion process off the production server to avoid the potential performance impact, and push the raw files to a Heavy Forwarder, which would then execute the script and send the results on to the Indexer. Is it possible to do this with Splunk Forwarders, or do I need to look into an automated copy process from the production servers to the processing server?
That's not really a good place for this: the heavy forwarder parses the events but does not index them, so you cannot do post-processing based on searches or scripts.
A cleaner approach is to create a scripted input that will tail your log files, parse and format them, and produce a clean CSV file that you then pass to the forwarder to index.
As stated before, something already does the conversion on the production server, but I want to move it off that server to reduce its processing load, since clients are connected to it. Also, according to the documentation, Heavy Forwarders do have the ability to index. I can introduce another process to get around this, but I want to see whether the technology we already have on the servers (Splunk) is capable of forwarding the data to another server and running the post-processing after the Intermediate server receives it. Also, the file cannot be tailed, since it is a proprietary compressed file from the application, which is why we have a special script from that application that can uncompress it. I would like to set up an Intermediate forwarder that captures these files, preprocesses them, and forwards the results to the indexers. If this is not possible then that's fine, I just need to know.
The only other method I can see to solve this is to share SAN space across all the servers, have the logs dumped on the SAN, and have the Intermediate server just watch for new logs and process them with scripted inputs.
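For what it's worth, the SAN option could look roughly like the sketch below. It assumes the script is registered on the Intermediate server as a Splunk scripted input (a `[script://...]` stanza in inputs.conf with an `interval`), so Splunk runs it periodically and picks up whatever it writes to stdout. The watch directory, the state file path and the converter command are all hypothetical placeholders, and the vendor script is assumed to take the compressed file as its only argument and print CSV to stdout; none of that comes from the actual environment.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical paths and command; replace with the real locations.
WATCH_DIR = Path("/mnt/san/app_logs")
STATE_FILE = Path("/opt/splunk/var/run/converted_files.txt")
CONVERTER_CMD = ["/opt/vendor/bin/convert_to_csv"]  # placeholder for the vendor script


def load_seen(state_file):
    """Return the set of file names that were already converted."""
    if not state_file.exists():
        return set()
    return set(state_file.read_text().splitlines())


def find_new_files(watch_dir, seen):
    """List regular files in watch_dir not yet recorded as seen, sorted by name."""
    if not watch_dir.is_dir():
        return []
    return sorted(p for p in watch_dir.iterdir() if p.is_file() and p.name not in seen)


def convert(path, converter_cmd):
    """Run the converter on one file and return its CSV output as text."""
    result = subprocess.run(
        converter_cmd + [str(path)], capture_output=True, text=True, check=True
    )
    return result.stdout


def run_once(watch_dir, state_file, converter_cmd, out=sys.stdout):
    """Convert every new file once, write the CSV to out, and record the file name."""
    seen = load_seen(state_file)
    processed = []
    for path in find_new_files(watch_dir, seen):
        try:
            csv_text = convert(path, converter_cmd)
        except (subprocess.CalledProcessError, OSError) as exc:
            # Leave the file unrecorded so the next run retries it.
            print(f"conversion failed for {path}: {exc}", file=sys.stderr)
            continue
        out.write(csv_text)
        with state_file.open("a") as fh:
            fh.write(path.name + "\n")
        processed.append(path.name)
    return processed


if __name__ == "__main__":
    run_once(WATCH_DIR, STATE_FILE, CONVERTER_CMD)
```

The state file is only there so that a file is not converted twice across runs. One thing this sketch does not handle is a file that is still being written when the script runs; a real version would probably need a size/age check or a "done" marker from the application before converting.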