I have a use case wherein we have around 0.5 TB of raw data coming in on a daily basis that needs to be analyzed/searched.
We have a Splunk Enterprise license, so I was thinking of using Splunk for this by storing the data on the file system and then getting those files indexed in Splunk. Just wondering if this is an efficient way.
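For the plain-Splunk route, the usual pattern is a monitor input watching the directory where the daily files land. A minimal sketch (the path, index, and sourcetype below are placeholder values, not anything from your environment):

```
# inputs.conf on the forwarder/indexer that sees the files
# -- /data/raw/daily, raw_daily, and raw_data are example names
[monitor:///data/raw/daily]
index = raw_daily
sourcetype = raw_data
recursive = true
```

At 0.5 TB/day you would also want to size your indexer tier and license volume accordingly, since monitored files count against the daily indexing license.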
Analysis done so far with other approaches:
1) Using Hunk (can't go for a licensed solution, hence crossing this out)
2) Using Splunk Analytics for Hadoop (I guess it's just a new name for Hunk. Do we still need to get a license for this? Also, it looks like it's an add-on, so do we still need to purchase it, or is it free to download?)
3) Storing data on HDFS and then using Splunk Hadoop Connect to index the HDFS data for searching.
Any suggestions w.r.t. these approaches would be helpful.
Yes, Hunk is the older name for Splunk Analytics for Hadoop. They are both licensed the same way.
Splunk Analytics for Hadoop is already part of normal Splunk, so you do not need to install any additional Splunk software (you do need Hadoop and Java on the search head).
Using Splunk Hadoop Connect will copy the files from HDFS to the Splunk indexers and index them there. Splunk Analytics for Hadoop will not index the data in Splunk; instead, it runs MapReduce jobs on the Hadoop cluster and returns the results to the search head.
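With Splunk Analytics for Hadoop, the data is exposed through a virtual index defined in indexes.conf on the search head. A rough sketch of what that configuration looks like (the provider name, index name, paths, and namenode address are all illustrative; check the settings against the docs for your Splunk version):

```
# indexes.conf on the search head -- all names and paths are example values
[provider:my-hadoop]
vix.family = hadoop
vix.env.JAVA_HOME = /usr/lib/jvm/java-8
vix.env.HADOOP_HOME = /opt/hadoop
vix.fs.default.name = hdfs://namenode:8020

[hdfs_raw]
vix.provider = my-hadoop
vix.input.1.path = /data/raw/${date_date}/...
```

Searching `index=hdfs_raw` then dispatches MapReduce jobs against the HDFS data in place, so nothing is copied into Splunk's own indexes, which is the key difference from Hadoop Connect.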