I'm trying to set a custom archive processor. Is this still supported in Splunk 4.1?
The documentation is contradictory. From props.conf.spec, the 2 parameters which both need to be set are invalid_cause and unarchive_cmd. The descriptions say invalid_cause can only be set for a sourcetype stanza, whereas unarchive_cmd can only be set for a source stanza. Is that even possible?
invalid_cause = <string>
* Can only be set for a [<sourcetype>] stanza.
* Splunk does not index any data with invalid_cause set.
* Set <string> to "archive" to send the file to the archive processor (specified in unarchive_cmd).
* Set to any other string to throw an error in the splunkd.log if running Splunklogger in debug mode.
* Defaults to empty.
is_valid = true | false
* Automatically set by invalid_cause.
* DO NOT SET THIS.
* Defaults to true.
unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive". This field is only valid on [source::stanzas].
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on stdout.
* Use _auto for Splunk's automatic handling of archive files (tar, tar.gz, tgz, tbz, tbz2, zip)
* Defaults to empty.
I can't get the archive processor to activate. Has anyone done this successfully?
This seems to be an old post, but for those who are still looking for an answer: the purpose was to read some binary logs using the archive processor. This configuration worked:
props.conf:
[source::/path/to/log/directories/...log]
invalid_cause = archive
unarchive_cmd = executable_to_read_binary
sourcetype = binary_log
NO_BINARY_CHECK = true
[default]
maxDist = 500
inputs.conf:
[monitor:///path/to/log/directories]
sourcetype = binary_log
I'm not sure whether sourcetype is mandatory to get this working. I was able to use invalid_cause under source::. Actually, this is the only way it works for me.
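For anyone wondering what a command like executable_to_read_binary has to do: per the spec, it only needs to read the raw file on stdin and write indexable text on stdout. Below is a minimal Python sketch of such a filter. The record layout (4-byte big-endian epoch seconds, 2-byte payload length, then a UTF-8 payload) is purely hypothetical and is not the format from the post above; replace it with whatever your binary log actually uses.

```python
#!/usr/bin/env python3
"""Sketch of a stdin-to-stdout filter usable as an unarchive_cmd.

The binary record layout below is a made-up example, not a real log format.
"""
import struct
import sys
from datetime import datetime, timezone

# Hypothetical header: 4-byte big-endian epoch seconds + 2-byte payload length.
HEADER = struct.Struct(">IH")


def decode_records(stream):
    """Yield one text line per binary record read from a binary stream."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return  # end of input, or a truncated trailing header
        epoch, length = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            return  # truncated trailing record; stop quietly
        stamp = datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        )
        yield f"{stamp} {payload.decode('utf-8', errors='replace')}"


def main():
    # Read raw bytes from stdin and emit one timestamped line per record.
    for line in decode_records(sys.stdin.buffer):
        sys.stdout.write(line + "\n")


if __name__ == "__main__":
    main()
```

You can check a filter like this outside Splunk first by piping a sample file into it and confirming that readable, timestamped lines come out on stdout.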
I looked through system/default/props.conf and it appears that you simply have to have your source-based stanza point to a custom/bogus sourcetype, which is where you set invalid_cause = archive.
I think an example may make more sense than the paragraph above.
[source::....(tbz|tbz2)(.\d+)?]
unarchive_cmd = _auto
sourcetype = preprocess-bzip
NO_BINARY_CHECK = true
[source::....bz2?(.\d+)?]
unarchive_cmd = bzip2 -cd -
sourcetype = preprocess-bzip
NO_BINARY_CHECK = true
[preprocess-bzip]
invalid_cause = archive
is_valid = False
LEARN_MODEL = false
What I don't get is this: what's the need for all the different "preprocess-*" sourcetypes? I mean, why not just create a single [preprocess-archive] (or something like that) and then point all the [source::...*] stanzas to that single sourcetype? All of the preprocess-* sourcetypes are identical in the system default file. I don't think you ever see these sourcetypes within Splunk, do you?