Getting Data In

Splunk Indexing .gz files as compressed/raw data and not the uncompressed version

qhrtaylordresch
Engager

Attached is an example of the data. I have also tried extracting the data from the .gz files first, and Splunk imported it fine that way. The stanza for the monitor is:

[monitor:///var/akamailogs/prod]
disabled = false
host = Akamai
index = akamaiweblog
sourcetype = access_combined

Am I missing something?


n0str0m08
Explorer

Hi,

At input time, you also have to specify the following in props.conf:

[access_combined]
invalid_cause = archive

[source::/var/akamailogs/prod]
unarchive_cmd = _auto

According to the Splunk documentation (https://docs.splunk.com/Documentation/Splunk/7.2.4/Admin/Propsconf):

invalid_cause = <string>
* Can only be set for a [<sourcetype>] stanza.
* If invalid_cause is set, the Tailing code (which handles uncompressed
  logfiles) will not read the data, but hand it off to other components or
  throw an error.
* Set <string> to "archive" to send the file to the archive processor
  (specified in unarchive_cmd).
* When set to "winevt", this causes the file to be handed off to the
  Event Log input processor.
* Set to any other string to throw an error in the splunkd.log if you are
  running Splunklogger in debug mode.
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.

unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive".
* This field is only valid on [source::<source>] stanzas.
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on
  stdout.
* Use _auto for Splunk software's automatic handling of archive files (tar, 
  tar.gz, tgz, tbz, tbz2, zip)
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.
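
Putting the two pieces together, here is a minimal props.conf sketch for this setup. The gzip line is a hypothetical alternative, not something the thread confirms; per the excerpt above, any shell filter that reads stdin and writes stdout will work, and gzip -cd decompresses a gzip stream that way. Depending on how your sources are named, the source stanza may also need a wildcard such as /var/akamailogs/prod/*:

# props.conf on the instance monitoring /var/akamailogs/prod

[access_combined]
# Hand matching files off to the archive processor
# instead of the tailing processor.
invalid_cause = archive

[source::/var/akamailogs/prod]
# Let Splunk pick the decompressor for recognized
# archive types (tar, tar.gz, tgz, tbz, tbz2, zip).
unarchive_cmd = _auto
# Hypothetical alternative for plain gzip streams:
# unarchive_cmd = gzip -cd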

qhrtaylordresch
Engager

Ah, forgot about that, thank you.
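
For anyone checking that these settings took effect, a quick sanity check is btool (a sketch, assuming a default $SPLUNK_HOME and that props.conf lives on the instance doing the monitoring):

$SPLUNK_HOME/bin/splunk btool props list access_combined --debug

The --debug flag prints which file each setting came from; the output should show invalid_cause = archive. Grepping the full btool listing for unarchive_cmd confirms the source stanza the same way.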
