Re: Does Hunk create an index?

techdiverdown · ‎05-27-2014

I have created a virtual index with CDH5 and Hunk 6.1. A simple query like the following:

index=tomnetflow destination_address="71.214.56.38"

runs about 28 minutes on our small, 5 node lab cluster with 32GB of memory and 3 HDFS nodes. There is some 35 million netflow records. My question is twofold:

1) When i run the same query over and over, the performance is very linear. Would I expect an index to be created somewhere so subsequent queries run faster?

2) In terms of improving Splunk/Hunk/Hadoop performance, if I segregate the data into directories in HDFS based on date for example (2014-05-26, 2014-05-27) will performance increase (provided i narrow my search to last 24 hours for example)?

Thank You.

nhaddadkaveh_sp · ‎05-27-2014

If you want to run the same query over and over it is better to make it a saved search and accelerate that in Hunk 6.1

Ledion_Bitincka · ‎05-27-2014

Just to clarify, Hunk does not create an index based on the data that it searches once. Yes, we do recommend that you partition your data based on time and any other fields that you search frequently.

rdagan_splunk · ‎05-27-2014

1) In order for you to create a MR job, you will need to change your Splunk query:
From this - index=tomnetflow destination_address="71.214.56.38"
To something like this - index=tomnetflow destination_address="71.214.56.38" | top destination_address

In addition, make sure that you are in ' smart mode ' and not in ' verbose mode '

2) Hunk uses VIX = Virtual Index. Therefore, the index itself is not created and performance will not be any faster.

3) To make sure Hunk runs faster - Make sure you run MR Jobs (see answer to #1), Make sure you use VIX with REGEX that will extract the time from the file name or the HDFS directory name (as you mentioned - that will allow Hunk to bring less data per MR job), If you use Report Acceleration that will Cache the results.

Does Hunk create an index?

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

What Is the Name of the USB Key Inserted by Bob Smith? (BOTS Hint, Not the Answer)

Automating Threat Operations and Threat Hunting with Recorded Future

Join the Conversation