We are using Hunk in a POC. Our HDFS file structure has a folder for every date, so, for example, our firewall logs are set up like:
/logs/fwsm (parent dir)
We set up a main virtual index at the parent, so we’re searching all logs under /logs/fwsm. The issue we’re running into is that we need to search per day, so I find myself creating a virtual index for every date. With that, I had two questions:
• Is there any other way to search by date using the virtual indexes?
• Is there any limit to the number of virtual indexes that can be created (as one can imagine, this will get real ugly when we start creating virtual indexes by date for multiple sourcetypes)?
Working with Splunk Support, the solution was to change the 'Time Range' setting under the Time section to 1 day. Once this change was applied, the date/time picker worked.
Thanks to everyone for the feedback and help.
I did not see that option/document - I assume the time capturing regex means I'd be able to search by date/time within the main virtual index? Am I basing the regex on the file structure, or the log's date/time format?
The option to capture the regex is part of the Virtual Index UI. Select the Customize Timestamp Format button.
Your assumption is correct. Once you set it up, you can use the search time picker to select a specific day within the HDFS data.
Here is an example:
path = /logs/fwsm/...
regex = .?/fwsm/(\d+)-(\d+)-(\d+)/.
format = yyyyMMdd
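To make the example above more concrete, here is a minimal Python sketch of what a time capturing regex and format pair does conceptually: the groups captured from the directory path are joined and parsed as a date. This is only an illustration, not Hunk's actual code. The sample paths, the spelled-out pattern, and the helper name extract_date are assumptions, and Python's %Y%m%d stands in for yyyyMMdd.

```python
import re
from datetime import datetime

# Hypothetical pattern for date directories such as /logs/fwsm/2015-11-03/
# (the date-folder naming is an assumption for illustration).
TIME_REGEX = re.compile(r"/fwsm/(\d+)-(\d+)-(\d+)/")


def extract_date(path):
    """Return the date encoded in the directory path, or None if absent."""
    match = TIME_REGEX.search(path)
    if match is None:
        return None
    # Join the captured groups ("2015", "11", "03" -> "20151103") and
    # parse them; %Y%m%d is the Python equivalent of yyyyMMdd.
    return datetime.strptime("".join(match.groups()), "%Y%m%d").date()


print(extract_date("/logs/fwsm/2015-11-03/fw.log"))
print(extract_date("/logs/fwsm/other/fw.log"))
```

The hyphens in the folder name sit outside the capture groups, which is why the format has no separators in it.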
Apologies, the actual dir structure is /LogCentral/Firewall, so I set my 'Time capturing regex' as follows - ?/Firewall/(d+)-(d+)-(d+)/. (leaving Time Format, Time Adjustment, and Time Zone untouched). But when I run a query - index=fwsm - using the Date/Time picker (I'm selecting Date Range|Before 11/3/2015), I get 'No results found'.
First, make certain there is a '.' char in front of your leading '?' char. (I realize that may just be a typo.)
Also, try setting the format to yyyyMMdd.
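One way to see why the leading '?' matters: a quantifier needs something in front of it to apply to. Here is a minimal Python sketch of that point. Python's regex engine is used only for illustration, so Hunk's own engine may report the problem differently, and the patterns and the helper name compiles are simplified stand-ins.

```python
import re


def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A bare leading '?' has nothing to quantify, so the pattern is rejected.
print(compiles(r"?/Firewall/(\d+)-(\d+)-(\d+)/"))
# With a '.' in front, '?' makes that one character optional and the
# pattern compiles.
print(compiles(r".?/Firewall/(\d+)-(\d+)-(\d+)/"))
```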
Was hoping to revisit this issue if possible as I'm seeing some weirdness with the time regex.
We have three directories on HDFS:
• /LogCentral/WindowsEvent
I have the following regex applied to our Firewall virtual index and I can use the time picker no problem (slightly modified from the original recommendation):
However, applying the same format to the other two logs (below) I get no events at all no matter what dates I select in the time picker, yet I'm using the same format.
Tried the following regex and got a match on regex101.com:
Yet when I enter that and try to run a search, it errors out:
[cdhprovider] Error while running external process, return_code=255. See search.log for more info
[cdhprovider] IOException - No input paths specified in job.
Yes, this will allow you to efficiently search by time within a single virtual index. The capturing regex will allow Hunk to choose which files to search based on the directories they are in, so it should match that, not the log structure.
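As a rough sketch of that idea (not Hunk's implementation), the snippet below keeps only the directories whose captured date falls inside the selected time range, which is why the regex has to describe the directory layout. The directory names, the pattern, and the function select_dirs are hypothetical.

```python
import re
from datetime import date

# Assumed layout: one directory per day, e.g. /LogCentral/Firewall/2015-11-02/
DIR_REGEX = re.compile(r"/Firewall/(\d{4})-(\d{2})-(\d{2})/")


def select_dirs(dirs, earliest, latest):
    """Keep directories whose date satisfies earliest <= date < latest."""
    selected = []
    for d in dirs:
        match = DIR_REGEX.search(d)
        if match is None:
            continue  # no date in the path, so skip it in this sketch
        year, month, day = (int(g) for g in match.groups())
        if earliest <= date(year, month, day) < latest:
            selected.append(d)
    return selected


dirs = [
    "/LogCentral/Firewall/2015-11-01/",
    "/LogCentral/Firewall/2015-11-02/",
    "/LogCentral/Firewall/2015-11-03/",
]
print(select_dirs(dirs, date(2015, 11, 2), date(2015, 11, 3)))
```

In this sketch, a pattern that matches none of the directories leaves an empty selection, so there would be nothing to read.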