Splunk Analytics for Hadoop: Why is Hunk searching all of the HDFS files instead of restricting the search to the selected time range?

EricLloyd79
Builder

We are new to Hunk (now called Splunk Analytics for Hadoop).
I am attempting to run a query on our HDFS directories for the last 5 mins.
Here is the query: index=foo | sort 0 _time
It should just return all the events from the last 5 mins in the index foo, sorted without truncation.

But it searches through all 8 million+ events in our HDFS directories, even after it seems to have found the complete list for the last 5 mins.

Any reasons why it might be doing this?

1 Solution

kschon_splunk
Splunk Employee

It sounds like an issue with your "et" (earliest time) and "lt" (latest time) configurations. When you give a search a time range, Splunk Analytics for Hadoop (formerly called Hunk) decides whether to read a particular file on HDFS based on the earliest and latest times for that file, as read from its path. (It may also skip files based on other field values, if you have configured other path field extractions.) The relevant configurations for your virtual index are:

vix.input.1.et.regex
vix.input.1.et.format
vix.input.1.et.offset
vix.input.1.lt.regex
vix.input.1.lt.format
vix.input.1.lt.offset

You can get more information about these properties here:
http://docs.splunk.com/Documentation/Splunk/6.5.1/Admin/Indexesconf
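
To make the idea concrete, here is a rough Python sketch of path-based time pruning. It is only an illustration of the concept, not how Splunk implements it. The directory layout (`/data/foo/<yyyyMMdd>/<HH>/`), the regex, the format, and the offsets are all hypothetical values chosen for the example. The sketch assumes that the regex capture groups are concatenated and then parsed with the format, that offsets are in seconds, and that a file whose path yields no time cannot be skipped, which would match the "reads everything" behavior described in the question.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical layout: one directory per hour,
# e.g. /data/foo/20161205/13/part-0000.log
PATH_TIME_REGEX = re.compile(r"/data/foo/(\d{8})/(\d{2})/")
PATH_TIME_FORMAT = "%Y%m%d%H"  # Python strptime analogue of a pattern like yyyyMMddHH
ET_OFFSET_SECONDS = 0          # earliest time: start of the hour named in the path
LT_OFFSET_SECONDS = 3600       # latest time: one hour after that


def file_time_range(path):
    """Return (earliest, latest) for a path, or None if no time can be extracted."""
    match = PATH_TIME_REGEX.search(path)
    if match is None:
        return None
    # Concatenate the capture groups, then parse the result with the format string.
    base = datetime.strptime("".join(match.groups()), PATH_TIME_FORMAT)
    base = base.replace(tzinfo=timezone.utc)
    return (base + timedelta(seconds=ET_OFFSET_SECONDS),
            base + timedelta(seconds=LT_OFFSET_SECONDS))


def must_read(path, search_earliest, search_latest):
    """Decide whether a file has to be read for the given search window."""
    file_range = file_time_range(path)
    if file_range is None:
        # Assumption: without a usable time range the file cannot be ruled out.
        return True
    file_earliest, file_latest = file_range
    # Read the file only if its interval overlaps the search interval.
    return file_earliest < search_latest and file_latest > search_earliest
```

In this sketch, a search over the last 5 minutes only needs the files under the current hour's directory. If the regex never matches the real paths, `must_read` returns `True` for every file, and the whole directory tree gets scanned.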

If you've already set these props and you don't know what's going wrong, please post the provider and vix stanzas for this vix from your indexes.conf file, and an example HDFS file path, after anonymizing any confidential portions.
