Hi,
I'm searching for the documentation for the new 6.5 Hadoop Data Roll feature and am unable to find it. Can someone point me to it, or to where it's set up within Splunk? Nothing obvious stands out.
About archiving indexes with Hadoop Data Roll
http://docs.splunk.com/Documentation/Splunk/6.5.0/Indexer/ArchivingindexestoHadoop
Searching on docs.splunk.com for "hadoop data roll" should turn this topic right up.
Thanks. I'll do that from now on - Mr. Google didn't find it.
Thanks. Couldn't find that via Mr. Google....
So, next stupid question - the doc doesn't indicate that Hunk is required to search this data after it's archived. Is that accurate? Can I query my data in Hadoop without requiring Hunk?
I think that is accurate.
You can search archived buckets the same way you normally search; simply include the archive virtual index in your searches. See Search archived index data (http://docs.splunk.com/Documentation/Splunk/6.5.0/Indexer/Archivesearchtips) for information about search commands that work with indexes stored in Hadoop.
You can, for example, create one search that covers both:
Data in a Splunk Enterprise index.
Archived data copied into HDFS or S3.
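As a rough sketch of what that could look like, assuming a regular index named main and an archive virtual index named main_archive (both names are made up for illustration; use whatever your own setup defines), a combined search might be:

```
(index=main OR index=main_archive) error earliest=-90d
| stats count by index
```

The search term "error" and the 90-day window are placeholders too. The idea is just that the archive virtual index is referenced with index= like any other index, so one search can span both the live and the archived data.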
Holy moly! So.......... next question - just data that was once in Splunk, or any data that is now in Hadoop?
Per my understanding, just the data that was once in Splunk and is now archived into HDFS/S3.
Yeah... you have already paid for that data to be indexed in Splunk Enterprise...
You don't have to pay for the archived data again.
If you ingest data directly into HDFS (e.g. using Flume), you haven't paid in Splunk land... you'll need a license for Splunk Analytics for Hadoop, formerly known as Hunk :-).
Does it make sense?
Figured. (That would have been too good to be true).
Still, it's a big help!
Thanks. Can you please accept this as the answer?