I'm trying to set a custom archive processor. Is this still supported in Splunk 4.1?
The documentation is contradictory. According to props.conf.spec, two parameters both need to be set: invalid_cause and unarchive_cmd. The descriptions say invalid_cause can only be set for a sourcetype stanza, whereas unarchive_cmd can only be set for a source stanza. Is that even possible?
invalid_cause = <string>
* Can only be set for a [<sourcetype>] stanza.
* Splunk does not index any data with invalid_cause set.
* Set <string> to "archive" to send the file to the archive processor (specified in unarchive_cmd).
* Set to any other string to throw an error in the splunkd.log if running Splunklogger in debug mode.
* Defaults to empty.

is_valid = true | false
* Automatically set by invalid_cause.
* DO NOT SET THIS.
* Defaults to true.

unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive". This field is only valid on [source::stanzas].
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on stdout.
* Use _auto for Splunk's automatic handling of archive files (tar, tar.gz, tgz, tbz, tbz2, zip)
* Defaults to empty.
I can't get the archive processor to activate. Has anyone done this successfully?
This seems to be an old post, but for those still looking for an answer: my goal was to read some binary logs using the archive processor. This configuration worked:
props.conf:

[source::/path/to/log/directories/...log]
invalid_cause = archive
unarchive_cmd = executable_to_read_binary
sourcetype = binary_log
NO_BINARY_CHECK = true

[default]
maxDist = 500

inputs.conf:

[monitor:///path/to/log/directories]
sourcetype = binary_log
I'm not sure the sourcetype setting is mandatory to get this working. I was able to use invalid_cause under a source:: stanza; in fact, that is the only way it works for me.
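For anyone wondering what a command like executable_to_read_binary might look like: the post does not show it, so the following is only a minimal sketch of the contract quoted from props.conf.spec (read the raw file on stdin, write text on stdout). The record layout (a 4-byte timestamp, a 2-byte length, then a UTF-8 message) and the names HEADER and decode_records are invented for illustration and are not from the original poster's setup.

```python
#!/usr/bin/env python3
"""Hypothetical stdin-to-stdout decoder of the kind unarchive_cmd expects."""
import struct
import sys
from datetime import datetime, timezone

# ASSUMED record layout (not from the post): 4-byte big-endian Unix timestamp,
# 2-byte big-endian payload length, then that many bytes of UTF-8 text.
HEADER = struct.Struct(">IH")


def decode_records(data: bytes):
    """Yield one text line per complete record; stop at a truncated record."""
    offset = 0
    while offset + HEADER.size <= len(data):
        ts, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        payload = data[offset:offset + length]
        if len(payload) < length:
            break  # trailing record is incomplete, so skip it
        offset += length
        stamp = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        yield f"{stamp} {payload.decode('utf-8', errors='replace')}"


def main() -> None:
    # The binary file arrives on stdin; decoded events go to stdout as text lines.
    raw = sys.stdin.buffer.read()
    for line in decode_records(raw):
        sys.stdout.write(line + "\n")


if __name__ == "__main__":
    main()
```

A script like this would be made executable and referenced from unarchive_cmd in place of executable_to_read_binary. Whether a full path is needed, and how Splunk locates the command, is something to verify against your own installation.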
I looked through system/default/props.conf, and it appears that you simply have to point your source-based stanza at a custom/bogus sourcetype; that sourcetype's stanza is where you set invalid_cause = archive.
I think an example may make more sense than the paragraph above.
[source::....(tbz|tbz2)(.\d+)?]
unarchive_cmd = _auto
sourcetype = preprocess-bzip
NO_BINARY_CHECK = true

[source::....bz2?(.\d+)?]
unarchive_cmd = bzip2 -cd -
sourcetype = preprocess-bzip
NO_BINARY_CHECK = true

[preprocess-bzip]
invalid_cause = archive
is_valid = False
LEARN_MODEL = false
What I don't get is this: why are all the different "preprocess-*" sourcetypes needed? Why not create a single [preprocess-archive] stanza (or something like that) and point all the [source::...*] stanzas to that one sourcetype? All of the preprocess-* sourcetypes are identical in the system default file. I don't think you ever see these sourcetypes within Splunk, do you?