Could Splunk be used as an ELT or ETL tool? What is the best Data Integration Studio for Splunk?

Hello Splunk Experts,

What is the best technique for integrating several CSV files, around 58 different source types from different machines, and building one overall dashboard on top of them? I have two options:

1- Splunk monitors all of those CSVs, and all transformation and cleansing is done afterwards with Splunk searches and joins. In that case I am using Splunk as an ELT tool (Splunk loading -> Splunk transformation and cleansing -> data visualization in Splunk).

2- Use an ETL data integration tool to transform those CSVs into a simpler set of around 4 to 5 CSV files, and let Splunk monitor those.

My questions: what is the best DI Studio tool for Splunk, what is the best approach, and what is the fastest way to achieve this? I currently prefer the second option.

Please give me your opinion,

Thanks,
Roy Imad

Re: Could Splunk be used as an ELT or ETL tool? What is the best Data Integration Studio for Splunk?

I am currently doing something close to this. I am using a Python script as a scripted input that goes through several files and then writes the transformed, relevant parts to standard output as CSV lines.

This has the advantage of reducing the volume of indexed data, which consumes less of your license and takes up less storage space. This solution can also be implemented on universal forwarders, as long as the servers have Python installed.

In fact, it has worked so well that I am thinking of adding more preprocessing Python scripts to further reduce the amount of data that is stored and indexed, because I am near my license limit.
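
For illustration, here is a minimal sketch of what such a preprocessing script could look like. It is not the exact script described above: the file pattern, the column names, and the helper names `transform` and `emit` are all hypothetical, and the filtering rule (drop rows whose kept columns are all empty) is just an assumed example of "keeping the relevant parts".

```python
import csv
import glob
import sys

# Hypothetical settings: adjust the glob pattern and column names to your data.
SOURCE_GLOB = "/var/data/exports/*.csv"
KEEP_FIELDS = ["timestamp", "host", "status"]


def transform(rows, keep_fields):
    """Yield only the wanted columns, skipping rows where all of them are empty."""
    for row in rows:
        slim = {name: (row.get(name) or "").strip() for name in keep_fields}
        if any(slim.values()):
            yield slim


def emit(paths, keep_fields, out):
    """Write one reduced CSV line per surviving row to `out`; return the count."""
    writer = csv.DictWriter(out, fieldnames=keep_fields, lineterminator="\n")
    count = 0
    for path in paths:
        with open(path, newline="") as handle:
            for slim in transform(csv.DictReader(handle), keep_fields):
                writer.writerow(slim)
                count += 1
    return count


if __name__ == "__main__":
    # A scripted input's standard output is what gets indexed.
    emit(sorted(glob.glob(SOURCE_GLOB)), KEEP_FIELDS, sys.stdout)
```

Note that this sketch re-emits every row on each run. A real scripted input would likely need to remember what it has already sent (for example, by tracking file offsets) to avoid indexing duplicates.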

Re: Could Splunk be used as an ELT or ETL tool? What is the best Data Integration Studio for Splunk?

If you are interested in option 2, you might want to look at our partner Pentaho's Data Integration (PDI) tool, which can extract, transform, and filter your raw CSVs and push (load) the refined data directly into Splunk using a PDI Splunk Output step.
