Splunk indexing .gz files as compressed raw data instead of the uncompressed contents

qhrtaylordresch
Engager

Attached is an example of the data. I also extracted the data from the .gz files, and it imported fine that way. The stanza for the monitor is:

[monitor:///var/akamailogs/prod]
disabled = false
host = Akamai
index = akamaiweblog
sourcetype = access_combined

Am I missing something?

n0str0m08
Explorer

Hi,

At input time, you also have to specify the following in props.conf:

[access_combined]
invalid_cause = archive

[source::/var/akamailogs/prod]
unarchive_cmd = _auto

According to the Splunk documentation (https://docs.splunk.com/Documentation/Splunk/7.2.4/Admin/Propsconf):

invalid_cause = <string>
* Can only be set for a [<sourcetype>] stanza.
* If invalid_cause is set, the Tailing code (which handles uncompressed
  logfiles) will not read the data, but hand it off to other components or
  throw an error.
* Set <string> to "archive" to send the file to the archive processor
  (specified in unarchive_cmd).
* When set to "winevt", this causes the file to be handed off to the
  Event Log input processor.
* Set to any other string to throw an error in the splunkd.log if you are
  running Splunklogger in debug mode.
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.

unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive".
* This field is only valid on [source::<source>] stanzas.
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on
  stdout.
* Use _auto for Splunk software's automatic handling of archive files (tar, 
  tar.gz, tgz, tbz, tbz2, zip)
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.
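
To illustrate the stdin-to-stdout contract described for unarchive_cmd above, here is a minimal Python sketch of a filter that turns a gzip stream back into plain log lines. It is not Splunk's archive processor and is not taken from the documentation. The function name gunzip_stream and the sample log line are made up for illustration.

```python
import gzip
import io
import shutil


def gunzip_stream(src, dst, chunk_size=64 * 1024):
    """Copy gzip-compressed bytes from src to dst as decompressed bytes.

    src and dst are binary file-like objects (hypothetical helper,
    not part of Splunk).
    """
    with gzip.GzipFile(fileobj=src, mode="rb") as decompressed:
        shutil.copyfileobj(decompressed, dst, chunk_size)


# In-memory demonstration with a made-up access_combined-style line.
sample = b'192.0.2.10 - - [01/Jan/2019:00:00:00 +0000] "GET / HTTP/1.1" 200 512\n'
compressed = io.BytesIO(gzip.compress(sample * 2))
output = io.BytesIO()
gunzip_stream(compressed, output)
restored = output.getvalue()
```

A script that called gunzip_stream(sys.stdin.buffer, sys.stdout.buffer) would have the stdin/stdout shape the spec asks for, though with unarchive_cmd = _auto a custom command like this should not be needed.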

qhrtaylordresch
Engager

Ah, I forgot about that. Thank you.
