Is it possible to define a lookup to act as a Hunk input filter at search-time?

rhinomike
Explorer

Hi there,

I have been testing Hunk and noticed that, because it does not pre-index data, it relies heavily on well-chosen regexes and other filters to speed up searches.

An example of this is the use of the vix.input.1.path, vix.input.1.et.* and vix.input.1.lt.* settings, as illustrated below:

[hunktest]
vix.input.1.accept = \.gz$
vix.input.1.path = /test/logs/${environmentid}/...
vix.provider = test-hadoop-cluster
vix.input.1.et.format = yyyyMMddHHmmssSSSS
vix.input.1.et.offset = -3600
vix.input.1.et.regex = .*/logs/\d+/data\.(\d+).*
vix.input.1.et.timezone = GMT
vix.input.1.lt.format = yyyyMMddHHmmssSSSS
vix.input.1.lt.offset = 0
vix.input.1.lt.regex = .*/logs/\d+/data\.(\d+).*
vix.input.1.lt.timezone = GMT
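
To make the time-based part of this filtering concrete, here is a rough Python sketch of what the et/lt settings above appear to express: the regex captures a timestamp from the file path, and the offsets widen it into an earliest/latest range that can be compared against the search window. This is only an illustration, not Hunk's actual implementation. The sample path in the usage note, the function names, the assumption that offsets are in seconds, and the choice to ignore the fractional digits are all mine.

```python
import re
from datetime import datetime, timedelta, timezone

# Same pattern as vix.input.1.et.regex / vix.input.1.lt.regex above.
TIME_REGEX = re.compile(r".*/logs/\d+/data\.(\d+).*")


def path_time_range(path, et_offset=-3600, lt_offset=0):
    """Return (earliest, latest) implied by a file path, or None if no match.

    Assumptions: offsets are in seconds, the timezone is GMT/UTC, and only
    the first 14 captured digits (yyyyMMddHHmmss) are used; the trailing
    fractional digits are ignored in this sketch.
    """
    match = TIME_REGEX.match(path)
    if match is None:
        return None
    digits = match.group(1)
    base = datetime.strptime(digits[:14], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return base + timedelta(seconds=et_offset), base + timedelta(seconds=lt_offset)


def may_contain(path, search_start, search_end):
    """True if the file's implied time range overlaps the search window."""
    time_range = path_time_range(path)
    if time_range is None:
        # Sketch's choice: a path we cannot parse is never skipped.
        return True
    earliest, latest = time_range
    return earliest <= search_end and latest >= search_start
```

With a hypothetical path such as /test/logs/123/data.201401011200000000.gz, path_time_range yields 11:00 to 12:00 GMT on 2014-01-01, so a search over a different day could skip that file without opening it.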

While the above works great, I am facing a small complication. ${environmentid} is a numerical value that has very little meaning to the people who would be using the search heads.

I know I can use a lookup and I have configured one:

[preprocess-gzip]
LOOKUP-env_to_ids = environment_name environmentid OUTPUTNEW environment_name

I also tested the lookup and it seems it is working:

When I perform a search like index=hunktest environmentid=123 I can navigate through the matches and see the environment_name field has been created and matches the CSV contents. I can also see that just one subfolder (123) has raised matches.

However, if I run index=hunktest environmentname=Test or index=hunktest environmentname="Test", search.log shows that Hunk crawled the whole HDFS store instead of just /logs/123/.

Is it possible to define a lookup so that it acts as a filter at search time?


Ledion_Bitincka
Splunk Employee

While lookups help with forward translation of values, a reverse lookup causes the search to be translated into (environmentid=123 OR environmentname=Test), which unfortunately means that search-based partition pruning cannot help. We'll take this as an enhancement request and research how to solve the problem. In the meantime, one workaround I can think of is to use form searches to help users pick an environment: show a user-friendly string, but use the ID to populate the search.
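
As a rough illustration of that workaround, the Python sketch below resolves the friendly name to its ID before the search string is built, so the search sent to Hunk contains only environmentid. It is a sketch of the idea, not a Splunk form. The CSV column names follow the lookup fields in the question, the 123/Test pair comes from the question, and the second row and the helper names are hypothetical.

```python
import csv
import io

# Hypothetical lookup contents; only the 123 -> Test pair appears in the thread.
LOOKUP_CSV = """environmentid,environment_name
123,Test
456,Production
"""


def load_name_to_id(csv_text):
    """Build a friendly-name -> environment ID mapping from lookup CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["environment_name"]: row["environmentid"] for row in reader}


def build_search(environment_name, name_to_id, index="hunktest"):
    """Return a search string that filters on the ID, not the friendly name."""
    environment_id = name_to_id[environment_name]  # KeyError for unknown names
    return f"index={index} environmentid={environment_id}"


if __name__ == "__main__":
    mapping = load_name_to_id(LOOKUP_CSV)
    print(build_search("Test", mapping))
```

The user only ever sees and picks "Test", while the generated search carries environmentid=123, which is the form that matched a single subfolder in the question.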
