It appears that there are several ways to bulk export data from Splunk.
- REST API
- search query option: outputcsv
- CLI search option: -output csv -maxout
- sendemail
I'm loading data into Splunk and there's a lot (in the millions) of records/events I would like to export as a flat file, ideally to a CSV. I plan on using these flat files as a source in our ETL jobs to populate our data warehouse.
The Splunk searches grab only certain fields I extracted from the inputs.
I know there are some hard limits to certain procedures regarding how many records we can extract per search.
What would be the best way to bulk export this data from Splunk?
I expect the number of records/events from search results should be in the millions.
Thanks!
There is another choice: you can set your indexer(s) to forward data to "3rd party systems." This will resend the raw data. See http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Forwarddatatothird-partysystemsd for details.
You could get around the "millions of events" problem with the other methods, such as outputcsv, by exporting more frequently (every hour, or even more often) so that each export stays small.
I am not sure what the upper bound is. I don't know of a limit for outputcsv, since it simply writes to a file on the Splunk server.
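To make the "export more frequently" idea concrete, here is a rough Python sketch that builds one search string per hourly window. The function name, the base search `index=main`, the field names, and the `export_...` file naming are all made up for illustration; adjust them to your own data. It only builds the strings, it does not run anything against Splunk.

```python
from datetime import datetime, timedelta, timezone


def hourly_export_searches(start, end, base_search, fields):
    """Yield one search string per hourly window in [start, end).

    Hypothetical helper: each string limits the time range with
    earliest/latest (epoch seconds), keeps only the wanted fields with
    `table`, and writes the result with `outputcsv`.
    """
    window_start = start
    while window_start < end:
        # The last window is clipped so it never runs past `end`.
        window_end = min(window_start + timedelta(hours=1), end)
        filename = "export_" + window_start.strftime("%Y%m%d_%H%M")
        yield (
            f"{base_search} "
            f"earliest={int(window_start.timestamp())} "
            f"latest={int(window_end.timestamp())} "
            f"| table {', '.join(fields)} "
            f"| outputcsv {filename}"
        )
        window_start = window_end


if __name__ == "__main__":
    day_start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for spl in hourly_export_searches(
        day_start, day_start + timedelta(hours=3), "index=main", ["host", "status"]
    ):
        print(spl)
```

Each generated string could then be scheduled as its own search, so no single export has to hold millions of rows.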
Thanks for the info. I was hoping I could extract fields out of the raw and only bulk export certain columns out of the indexed data. I don't think forwarding raw data to a 3rd party system gives me much room for customizing how the flat file should look.
Maybe exporting more frequently is the way to go with this large amount of data. Do you know what the limit is for outputcsv? Maybe the CLI search is better suited, since it has the -maxout 0 option.
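For exporting only selected fields, the REST API from the list in the question may also be worth a look. Below is a minimal Python sketch, not a tested production script, that posts a search to the `/services/search/jobs/export` endpoint with `output_mode=csv` and streams the response to a file. The host, the management port 8089, the credentials, and the search itself are assumptions for illustration, and I have not verified what limits apply on any particular Splunk version.

```python
import base64
import shutil
import urllib.parse
import urllib.request


def build_export_request(base_url, search, username, password):
    """Build a POST request for Splunk's search export endpoint.

    base_url is the management URL (assumed here: https://host:8089).
    The search string must start with "search" or a pipe.
    """
    body = urllib.parse.urlencode(
        {"search": search, "output_mode": "csv"}
    ).encode("ascii")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        method="POST",
    )
    # HTTP basic authentication; the credentials are placeholders.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


def export_to_csv(request, csv_path, context=None):
    """Stream the response body to csv_path without loading it into memory."""
    with urllib.request.urlopen(request, context=context) as response:
        with open(csv_path, "wb") as out:
            shutil.copyfileobj(response, out)


if __name__ == "__main__":
    req = build_export_request(
        "https://localhost:8089",            # assumed management URL
        "search index=main | table host, status",  # assumed search
        "admin",                             # placeholder credentials
        "changeme",
    )
    # A server with a self-signed certificate would need a suitable
    # ssl.SSLContext passed as `context`.
    export_to_csv(req, "export.csv")
```

Because the response is copied to disk in chunks, the size of the result set should not matter much on the client side; whether the server imposes its own cap is something to confirm in the documentation for your version.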