Our Hadoop data is stored as sequence files with LZ4 compression.
I've configured Hunk on a Cloudera QuickStart VM and pointed a virtual index at the HDFS parent directory of our data.
Hunk appears to be trying to parse out events, but it does so incorrectly: the events come back mostly as hex gibberish, with a few readable words.
How do I configure Hunk so that it knows these are sequence files?
I found the problem -- in the virtual index provider tab, there is a setting for a regex to match sequence files:
vix.splunk.search.recordreader.sequence.regex
Our files didn't match the default setting ".seq$". I changed it to match our files, and now it works.
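For anyone hitting the same thing, here is a rough way to check a candidate pattern against your file names before changing the provider setting. It is only a sketch: the file paths and the custom pattern below are made up for illustration, and it assumes the regex is applied as an unanchored search against the file path. Hunk evaluates the pattern in Java, so Python's `re` module is only an approximation, although simple patterns like these behave the same in both.

```python
import re

def matches_sequence_regex(path: str, pattern: str) -> bool:
    """Return True if `pattern` is found anywhere in `path` (unanchored search)."""
    return re.search(pattern, path) is not None

# Default value as quoted in this thread.
DEFAULT_PATTERN = r".seq$"

# Hypothetical replacement pattern for files named like part-00000.
CUSTOM_PATTERN = r"part-\d+$"

# Hypothetical HDFS paths, for illustration only.
paths = [
    "/data/events/batch1.seq",
    "/data/events/part-00000",
]

for path in paths:
    print(
        path,
        "default:", matches_sequence_regex(path, DEFAULT_PATTERN),
        "custom:", matches_sequence_regex(path, CUSTOM_PATTERN),
    )
```

If a file does not match the pattern, it is presumably read as plain text rather than as a sequence file, which would explain the hex gibberish in the original question.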
