Hunk and virtual indexes

jwalzerpitt
Influencer

We are using Hunk in a POC. Our HDFS file structure has a folder for every date, so, for example, our firewall logs are laid out like this:

/logs/fwsm (parent dir)
--/2015-11-06
--/2015-11-05
--/2015-11-04

--/2015-10-31

We set up a main virtual index at the parent, so we're searching all logs under /logs/fwsm. The issue we're running into is that we need to search per day, so I find myself creating a virtual index for every date. That leads to two questions:

• Is there any other way to search by date using the virtual indexes?
• Is there any limit to the number of virtual indexes that can be created? (As one can imagine, this will get real ugly when we start creating virtual indexes by date for multiple sourcetypes.)

Thx

rdagan_splunk
Splunk Employee

Have you tried to use the Time Capturing Regex as shown in this document?
http://docs.splunk.com/Documentation/Hunk/latest/Hunk/Addavirtualindex

jwalzerpitt
Influencer

Working with Splunk Support, the solution was to change the 'Time Range' setting under the Time section to 1 day. Once this change was applied, the date/time picker worked.
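One way to picture why a 1-day range matters: if each dated directory is treated as covering the 24 hours that start at the date in its name, the time picker can keep only the directories whose window overlaps the search window. The sketch below models that idea in Python. The interpretation of 'Time Range', the helper names `overlaps` and `select_dirs`, and the directory list are assumptions for illustration, not Hunk's actual implementation.

```python
from datetime import datetime, timedelta

# Assumption: a 1-day 'Time Range' means each directory spans 24 hours
# starting at the date in its name.
ONE_DAY = timedelta(days=1)


def overlaps(dir_start, search_earliest, search_latest, span=ONE_DAY):
    """True if [dir_start, dir_start + span) intersects [search_earliest, search_latest)."""
    dir_end = dir_start + span
    return dir_start < search_latest and search_earliest < dir_end


# Directory names taken from the layout in the question.
DIRS = {
    name: datetime.strptime(name, "%Y-%m-%d")
    for name in ["2015-10-31", "2015-11-04", "2015-11-05", "2015-11-06"]
}


def select_dirs(earliest, latest):
    """Return the directory names whose assumed 1-day window overlaps the search window."""
    return sorted(
        name for name, start in DIRS.items() if overlaps(start, earliest, latest)
    )


if __name__ == "__main__":
    # Searching the single day 2015-11-05 keeps only that directory.
    print(select_dirs(datetime(2015, 11, 5), datetime(2015, 11, 6)))
```

Under this model, a directory with no range at all would have nothing to intersect with the picker's window, which is at least consistent with the behavior described above.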

Thx for everyone's feedback and help

jwalzerpitt
Influencer

I did not see that option/document - I assume the time capturing regex means I'd be able to search by date/time within the main virtual index? Am I basing the regex on the file structure, or the log's date/time format?

Thx

rdagan_splunk
Splunk Employee

The option to capture the regex is part of the Virtual Index UI: select the Customize Timestamp Format button.
Your assumption is correct. Once you set it up, you can use the search time picker to select a specific day within the HDFS data.
Here is an example:
path = /logs/fwsm/...
accept =
regex = .?/fwsm/(\d+)-(\d+)-(\d+)/.
format = yyyyMMdd
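To see how a capturing regex and a yyyyMMdd format fit together, here is a minimal Python sketch of the idea: the captured groups from the directory name are joined into one string, and that string is parsed with the date format. The pattern below is an illustrative one written for this sketch (not the exact string above), the file names are hypothetical, and `path_to_date` is a made-up helper; Hunk's own matching rules may differ.

```python
import re
from datetime import datetime
from typing import Optional

# Illustrative pattern for this sketch only: three digit groups
# separated by hyphens in the directory directly under /fwsm/.
DATE_DIR = re.compile(r"/fwsm/(\d+)-(\d+)-(\d+)/")


def path_to_date(path: str) -> Optional[datetime]:
    """Return the date encoded in the directory name, or None if the path does not match."""
    match = DATE_DIR.search(path)
    if match is None:
        return None
    # Join the captured groups: ("2015", "11", "06") becomes "20151106".
    stamp = "".join(match.groups())
    # "%Y%m%d" is Python's counterpart of the yyyyMMdd format.
    return datetime.strptime(stamp, "%Y%m%d")


if __name__ == "__main__":
    # Hypothetical file names under the layout from the question.
    for p in ["/logs/fwsm/2015-11-06/fw.log", "/logs/fwsm/readme.txt"]:
        print(p, "->", path_to_date(p))
```

This also shows why the format has no hyphens even though the directory names do: the hyphens sit outside the capture groups, so only the digits are left to parse.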

jwalzerpitt
Influencer

Thx

Apologies, the actual directory structure is /LogCentral/Firewall, so I set my 'Time capturing regex' as follows - ?/Firewall/(d+)-(d+)-(d+)/. (leaving Time Format, Time Adjustment, and Time Zone untouched). But when I run a query (index=fwsm) with the Date/Time picker set to Date Range | Before 11/3/2015, I get 'No results found'.

Thx

kschon_splunk
Splunk Employee

First, make certain there is a '.' char in front of your leading '?' char. (I realize that may just be a typo.)

Also, try setting the format to yyyyMMdd.

jwalzerpitt
Influencer

I added the '.' in front of the leading '?' and set the time format to yyyyMMdd, and it worked! I can't thank you enough!!

Greatly appreciated!

kschon_splunk
Splunk Employee

Happy it worked!

jwalzerpitt
Influencer

Was hoping to revisit this issue if possible as I'm seeing some weirdness with the time regex.

We have three directories on HDFS:

• /LogCentral/Firewall
• /LogCentral/ISE
• /LogCentral/WindowsEvent

I have the following regex applied to our Firewall virtual index and I can use the time picker no problem (slightly modified from the original recommendation):

.?/Firewall/(d+)-(d+)-(d+)/.?)

However, applying the same format to the other two logs (below) I get no events at all no matter what dates I select in the time picker, yet I'm using the same format.

.?/ISE/(d+)-(d+)-(d+)/.?)
.?/WindowsEvent/(d+)-(d+)-(d+)/.?)

Tried the following regex and got a match on regex101.com:

.+ISE/(d+)-(d+)-(d+)

Yet when I enter that and try and run a search, it errors out:

[cdhprovider] Error while running external process, return_code=255. See search.log for more info
[cdhprovider] IOException - No input paths specified in job.
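Two things may be worth ruling out with a quick offline check before changing the virtual index again. First, if the backslashes are really missing from the configured pattern (and not just from this post), `(d+)` matches literal 'd' characters, not digits. Second, a tester such as regex101 reports a match anywhere in the string, whereas a matcher that must cover the whole path would reject a pattern that stops at the date. Whether Hunk requires a whole-path match is an assumption here, and the file name below is hypothetical; the Python sketch only demonstrates the two regex behaviors.

```python
import re

# Hypothetical file under the ISE directory from the post.
path = "/LogCentral/ISE/2015-11-06/ise.log"

unescaped = r".+ISE/(d+)-(d+)-(d+)"    # as typed in the post
escaped = r".+ISE/(\d+)-(\d+)-(\d+)"   # with \d for digits

# 'd+' requires literal 'd' characters, so the date directory never matches.
print(re.search(unescaped, path))

# With \d the three date parts are captured when searching anywhere in the path.
print(re.search(escaped, path).groups())

# A whole-string match fails because '/ise.log' is left over after the date.
print(re.fullmatch(escaped, path))

# Allowing the rest of the path after the date makes the whole-string match succeed.
print(re.fullmatch(escaped + r"/.*", path) is not None)
```

If no path survives the pattern, an empty input list would be one plausible source of the 'No input paths specified in job' error, though search.log is the place to confirm that.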

Thx

kschon_splunk
Splunk Employee

Yes, this will allow you to efficiently search by time within a single virtual index. The capturing regex will allow Hunk to choose which files to search based on the directories they are in, so it should match that, not the log structure.

jwalzerpitt
Influencer

Thx for the reply and info
