Basic problem: in smart mode my fields are not being extracted, but everything works in verbose mode. Time-based searching also works, so I know the way I specify the time field is correct.
Search that fails: index=foo | stats count by hii (or any field that isn't partitioned)
I have looked at the previous questions on Hunk extractions and smart mode (e.g. https://answers.splunk.com/answers/147879/why-hunks-field-extractor-behaves-differently-in-smart-mode-vs-fast-mode.html) but I cannot get mine to work.
We are using log files generated by Spark. They are Snappy compressed, with the name ... snappy.orc
There is no metastore, so I provide a fake database and table to make Splunk happy.
I specify the exact fields and their types.
I tried making all or some of the fields required, per Leon B's posts, but that didn't help.
I have the Snappy jar on the THIRD_PARTY_JARS, and Splunk is able to decompress the ORC files.
indexes.conf
vix.input.1.splitter.hive.fileformat = orc
vix.input.1.splitter.hive.columnnames = cqtq, ttms, chi, crc, pssc, psql, cqhm, cquc, caun, phr, psct, cquuc, cqtr, cqssl, cqssr, pitag, sstc, psqql, ttsfb,ttrq, cqbl, pttsfb, tfstoc, sscl, UA, tsso, sscc, phi, chp, Carpcqh, sssc, cqssv, cqssc, hii
vix.input.1.splitter.hive.columntypes = string:int:string:string:int:bigint:string:string:string:string:string,string,int:int:int:string:int:int:int:int:bigint:int:bigint:bigint:string,int:int:string:int:string:string:string:string:string
vix.input.1.required_fields = cqtq,ttms,UA,hii
# Completely made up values to satisfy Splunk
vix.input.1.splitter.hive.tablename = transfered
vix.input.1.splitter.hive.dbname = default
In my provider I have vix.splunk.search.splitter = HiveSplitGenerator
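Since the column names and types above are long, hand-maintained lists, a quick sanity check may help rule out a simple mismatch. The sketch below is only an illustration: check_hive_columns is a hypothetical helper, not part of Splunk or Hunk, and it assumes primitive types only (a type such as decimal(10,2) would confuse the split). It counts the entries on each side and reports whether the type list mixes ":" and "," as separators. Whether the Hive splitter tolerates mixed separators is not something I can confirm here; the check only surfaces it.

```python
import re


def check_hive_columns(names_value, types_value):
    """Compare a columnnames value against a columntypes value.

    Hypothetical helper for eyeballing the config; assumes primitive types.
    """
    # Column names are comma-separated in the config above.
    names = [n.strip() for n in names_value.split(",") if n.strip()]
    # Accept either ':' or ',' between types so that both styles are counted.
    types = [t.strip() for t in re.split(r"[:,]", types_value) if t.strip()]
    separators = set(re.findall(r"[:,]", types_value))
    return {
        "name_count": len(names),
        "type_count": len(types),
        "counts_match": len(names) == len(types),
        "mixed_separators": len(separators) > 1,
        "pairs": list(zip(names, types)),
    }


# A fragment copied from the settings above; paste the full values to check them all.
result = check_hive_columns(
    "caun, phr, psct, cquuc, cqtr",
    "string:string:string,string,int",
)
for key, value in result.items():
    print(key, value)
```

Running it on the full columnnames and columntypes values shows each name next to the type it would be paired with, which makes an off-by-one easy to spot.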
props.conf
[source::/projects/flickr/flopsa/ycpi_spark/orc/...]
priority = 202
sourcetype = foo
NO_BINARY_CHECK = true
[foo]
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_PREFIX = cqtq\":
TIME_FORMAT = %s.%3N
(Note: I also tried these two, which also get the time search to work, but the fields are still not extracted.)
eval-_time=strptime('cqtq',"%s.%3N")
EXTRACT-_time=strptime('cqtq',"%s.%3N")
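To show what the TIME_PREFIX and TIME_FORMAT settings above are expected to match, here is a minimal sketch outside Splunk. It assumes the raw event is rendered as JSON-like text with the timestamp as epoch seconds plus milliseconds; the sample event and the extract_epoch helper are made up for illustration and do not come from the real data.

```python
import re
from datetime import datetime, timezone

# Same pattern as TIME_PREFIX in props.conf: the field name, a quote, a colon.
TIME_PREFIX = r'cqtq\":'


def extract_epoch(raw_event):
    """Return the epoch value that follows TIME_PREFIX, or None if absent."""
    # %s.%3N corresponds to epoch seconds, a dot, then three millisecond digits.
    match = re.search(TIME_PREFIX + r"\s*(\d+\.\d{3})", raw_event)
    if match is None:
        return None
    return float(match.group(1))


# Hypothetical event text; the real rendering of an ORC row may differ.
sample = '{"cqtq":1412345678.123,"hii":"example"}'
epoch = extract_epoch(sample)
print(epoch)
print(datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat())
```

If the prefix matches in a sketch like this but fields are still missing in smart mode, that would be consistent with timestamp parsing and field extraction being separate problems.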