Creating Hunk 6.1 Virtual Index
I am using CDH5 (MR2) and Hunk 6.1 on CentOS 6.4.
My NetFlow ASCII data is stored in HDFS in 15-minute increments: each day has its own directory, and each file in it holds 15 minutes of NetFlow data. Something like this:
/user/netflow/2015-05-25/asciiflow2014-05-25-02-45-01.csv
/user/netflow/2015-05-25/asciiflow2014-05-25-03-00-01.csv
..
..
/user/netflow/2015-05-26/asciiflow2014-05-26-02-45-01.csv
..
Given this layout, is the virtual index configuration listed below correct?
My searches seem to take the same amount of time no matter what time period I select.
Time Capturing Regex is "/user/netflow/(\d+)-(\d+)-(\d+)/"
Time Format is "yyyyMMdd"
Time Adjustment is 15 Minutes??
Time Range is 1 day ??
Reply from a Splunk Employee:
You can either extract the time range from the parent dir:
Time Capturing Regex: "/user/netflow/(\d+)-(\d+)-(\d+)/"
Time Format: "yyyyMMdd"
Time Adjustment: 0
Time Range: 1 day
or you can extract the more granular timestamp at the file level:
Time Capturing Regex: "asciiflow(\d+)-(\d+)-(\d+)-(\d+)-(\d+)-\d+.csv$"
Time Format: "yyyyMMddHHmm"
Time Adjustment: 0
Time Range: 15 minutes
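To sanity-check either set of settings outside Hunk, the rough Python sketch below mimics the idea: the regex capture groups are concatenated, parsed with the time format, shifted by the adjustment, and extended by the time range to give the window a path is assumed to cover. This is only an illustration, not Hunk's actual implementation. The function names `path_time_window` and `overlaps` are made up for this example, and the Java-style formats "yyyyMMdd" and "yyyyMMddHHmm" are written as their Python `strptime` equivalents.

```python
import re
from datetime import datetime, timedelta


def path_time_window(path, regex, strptime_format, adjustment, time_range):
    """Return (earliest, latest) covered by a path, or None if the regex does not match.

    Hypothetical helper: an approximation of how a time-capturing regex,
    time format, adjustment and range could be combined.
    """
    match = re.search(regex, path)
    if match is None:
        return None
    # Concatenate the capture groups, e.g. ("2015", "05", "25") -> "20150525".
    stamp = "".join(match.groups())
    earliest = datetime.strptime(stamp, strptime_format) + adjustment
    return earliest, earliest + time_range


def overlaps(window, search_start, search_end):
    """True if the path's window intersects the search time range."""
    earliest, latest = window
    return earliest < search_end and latest > search_start


# Option 1: time taken from the parent directory, one day per directory.
DIR_REGEX = r"/user/netflow/(\d+)-(\d+)-(\d+)/"
# Option 2: time taken from the file name, 15 minutes per file.
FILE_REGEX = r"asciiflow(\d+)-(\d+)-(\d+)-(\d+)-(\d+)-\d+.csv$"

# Sample path copied from the question.
path = "/user/netflow/2015-05-25/asciiflow2014-05-25-02-45-01.csv"

day_window = path_time_window(
    path, DIR_REGEX, "%Y%m%d", timedelta(0), timedelta(days=1)
)
file_window = path_time_window(
    path, FILE_REGEX, "%Y%m%d%H%M", timedelta(0), timedelta(minutes=15)
)

print("directory-level window:", day_window)
print("file-level window:     ", file_window)

# A one-hour search only needs paths whose window overlaps it.
search_start = datetime(2015, 5, 25, 2, 0)
search_end = datetime(2015, 5, 25, 3, 0)
print("directory-level match:", overlaps(day_window, search_start, search_end))
print("file-level match:     ", overlaps(file_window, search_start, search_end))
```

Note that in the sample paths as posted, the directory name uses the year 2015 while the file name uses 2014, so the two options yield different years for this exact path. If the real data has the same mismatch, the choice of regex decides which timestamp the time-based filtering relies on.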