Not possible to stream data - can I create files that I load into indexer?

belka · ‎12-05-2013

I have a lot of instrumentation data from remote sites that I am collecting in Splunk. This data needs to go to the main index in the Enterprise, but I cannot stream data from remote sites through the firewall, etc. due to security concerns. I am proposing collecting with a Heavy Forwarder, but I need to create files from the collected data so it can be sent through the firewall, scanned, and then put in the main indexer farm. Should I collect it and index at the HF and then forward the files, or collect raw data in files and then send it through the firewall and then into the main indexer farm?

Thanks for any help. Not having much luck so far and I don't want to maintain multiple Splunk systems/dashboards, etc. if I don't have to.

norbert_hamel · ‎08-15-2014

The HF in front of the firewall could index the data, let's say in index=intermediate. You could then run a scheduled search, for example every five minutes, that reads the events from index=intermediate and writes the results, without any further processing, to CSV files locally on the HF. You can use the outputcsv command for that. The name of the CSV file can be generated dynamically during the search, so each output gets its own file name, e.g. one containing the timestamp of the search, much like rotating log files.
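To make the "every five minutes, one file per run" idea concrete, here is a minimal Python sketch of the bookkeeping only. It assumes each run should cover the most recent fully elapsed five-minute window, so that consecutive exports neither overlap nor leave gaps. The helper names and the `intermediate_` prefix are hypothetical and not part of Splunk; the search itself would still be scheduled in Splunk.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=5)  # matches the five-minute schedule suggested above


def export_window(now: datetime, window: timedelta = WINDOW):
    """Return (earliest, latest) for the most recent fully elapsed window.

    `now` must be timezone-aware.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    elapsed = (now - epoch) // window      # whole windows since the epoch
    latest = epoch + elapsed * window      # snap down to a window boundary
    return latest - window, latest


def export_filename(latest: datetime, prefix: str = "intermediate") -> str:
    """Build a rotating-log style file name from the end of the window.

    The prefix is a hypothetical naming choice.
    """
    return f"{prefix}_{latest:%Y%m%d_%H%M%S}.csv"


if __name__ == "__main__":
    earliest, latest = export_window(datetime.now(timezone.utc))
    print(earliest.isoformat(), latest.isoformat(), export_filename(latest))
```

Naming the file after the end of the window rather than the wall-clock start of the search keeps names stable even if a scheduled run starts a few seconds late.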

You will end up with a steadily growing number of CSV files on your HF, which can be passed through the firewall to the indexer, for example using scp or rsync or whatever makes sense (note that I am not the Linux expert 🙂 ). Or, if you don't want those files on the indexer, you could move them from the HF to any other Linux host behind the firewall and use a Universal Forwarder from there. On the indexer, you process those CSV files as a usual file input.
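One practical detail with this kind of hand-off is making sure only completed files leave the HF. The following is a rough Python sketch, not a tested transfer solution: it moves CSV files that have not been modified for a while into a separate outbound directory, from which scp, rsync, or a scanning step could pick them up. The directory arguments, the 60-second age threshold, and the `.part` suffix are all assumptions made for illustration.

```python
import os
import shutil
import time
from pathlib import Path


def stage_finished_csvs(src_dir, out_dir, min_age_seconds=60, now=None):
    """Move CSV files that have not changed recently from src_dir to out_dir.

    Returns the sorted list of staged file names.
    """
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    now = time.time() if now is None else now
    staged = []
    for path in sorted(src.glob("*.csv")):
        if now - path.stat().st_mtime < min_age_seconds:
            continue  # possibly still being written by the export search
        tmp = out / (path.name + ".part")
        shutil.copy2(path, tmp)            # copy under a temporary name first
        os.replace(tmp, out / path.name)   # atomic rename within out_dir
        path.unlink()                      # remove the source only after success
        staged.append(path.name)
    return staged
```

Copying to a temporary name and renaming afterwards means a transfer job that looks only for `*.csv` in the outbound directory never sees a half-written file.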

I currently have no access to my Splunk. If you need help with the dynamic CSV file names, please get back to me; I am happy to help.

jtrucks · ‎12-05-2013

If you want to minimize the Splunk install footprint in your environment, use a log collector to create the raw files, then have a process that shovels them through the firewall (via push or pull, depending on your topology and firewall rules). After that, just have Splunk read whatever data files land in the directory you drop them into on the indexer. Splunk will then automagically suck it all in for you.

--
Jesse Trucks
Minister of Magic

belka · ‎12-05-2013

Can an HF collect raw, unindexed files and then shovel them through?
