Hunk bucket archive question?

tsunamii · ‎03-22-2016

When Hunk pushes buckets to HDFS, it also pushes a couple of small supporting files (metadata, etc.). Given Hadoop's issues with handling small files, I was wondering whether that is something that's been looked at?

For Hunk archiving on clustered indexers, I understand that if a bucket has two or more searchable copies, searches against the archived data will return duplicated results. Is that correct?

hsesterhenn · ‎03-22-2016

Hi

1) Yes. There will be some overhead. Since there are not millions of buckets, I don't think this will be an issue.

2) Every indexer will try to copy its own bucket to HDFS. If there is already a valid copy of this bucket in the archive, the indexer will skip it.
There should be no duplicates.
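
To illustrate the skip-if-already-archived behaviour described in 2), here is a minimal Python sketch. It is not Splunk's archiver code: the function `archive_bucket`, the dict standing in for HDFS, and the bucket and indexer names are all made up for illustration, and it does not model how a copy is judged "valid" or how concurrent copies are handled.

```python
# Hypothetical sketch only: this is NOT how Hunk is implemented internally.
# A plain dict stands in for the HDFS archive (bucket id -> indexer that copied it).

def archive_bucket(archive: dict, bucket_id: str, indexer: str) -> bool:
    """Copy a bucket into the archive unless a copy is already present.

    Returns True if this indexer performed the copy, False if it skipped.
    """
    if bucket_id in archive:
        # Another indexer already archived this bucket, so skip it.
        return False
    archive[bucket_id] = indexer
    return True


archive = {}
# Two clustered indexers each hold a searchable copy of the same bucket.
results = [archive_bucket(archive, "bucket_42", name) for name in ("idx1", "idx2")]
print(results)       # [True, False]
print(len(archive))  # 1
```

Only the first indexer copies the bucket, so the archive ends up with a single copy and a search over it sees each event once.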

HTH,

Holger
