Getting Data In

Splunk Indexing .gz files as compressed/raw data and not the uncompressed version

qhrtaylordresch
Engager

alt text
Attached is an example of the data, I have also extracted the data from the gz files and it was able to import the data fine that way. The stanza for the monitor is

[monitor:///var/akamailogs/prod]
disabled = false
host = Akamai
index = akamaiweblog
sourcetype = access_combined

Am I missing something?

Tags (1)
0 Karma
1 Solution

n0str0m08
Explorer

Hi,

During input time you have to specify also in props.conf

[access_combined]
invalid_cause = archive

[source::/var/akamailogs/prod]
unarchive_cmd = _auto

According to the Splunk Doc https://docs.splunk.com/Documentation/Splunk/7.2.4/Admin/Propsconf

invalid_cause = <string>
* Can only be set for a [<sourcetype>] stanza.
* If invalid_cause is set, the Tailing code (which handles uncompressed
  logfiles) will not read the data, but hand it off to other components or
  throw an error.
* Set <string> to "archive" to send the file to the archive processor
  (specified in unarchive_cmd).
* When set to "winevt", this causes the file to be handed off to the
  Event Log input processor.
* Set to any other string to throw an error in the splunkd.log if you are
  running Splunklogger in debug mode.
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.

unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive".
* This field is only valid on [source::<source>] stanzas.
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on
  stdout.
* Use _auto for Splunk software's automatic handling of archive files (tar, 
  tar.gz, tgz, tbz, tbz2, zip)
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.

View solution in original post

n0str0m08
Explorer

Hi,

During input time you have to specify also in props.conf

[access_combined]
invalid_cause = archive

[source::/var/akamailogs/prod]
unarchive_cmd = _auto

According to the Splunk Doc https://docs.splunk.com/Documentation/Splunk/7.2.4/Admin/Propsconf

invalid_cause = <string>
* Can only be set for a [<sourcetype>] stanza.
* If invalid_cause is set, the Tailing code (which handles uncompressed
  logfiles) will not read the data, but hand it off to other components or
  throw an error.
* Set <string> to "archive" to send the file to the archive processor
  (specified in unarchive_cmd).
* When set to "winevt", this causes the file to be handed off to the
  Event Log input processor.
* Set to any other string to throw an error in the splunkd.log if you are
  running Splunklogger in debug mode.
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.

unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive".
* This field is only valid on [source::<source>] stanzas.
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on
  stdout.
* Use _auto for Splunk software's automatic handling of archive files (tar, 
  tar.gz, tgz, tbz, tbz2, zip)
* This setting applies at input time, when data is first read by Splunk 
  software, such as on a forwarder that has configured inputs acquiring the 
  data.
* Defaults to empty.

qhrtaylordresch
Engager

Ah forgot about that, thank you.

0 Karma
Get Updates on the Splunk Community!

Splunk Mobile: Your Brand-New Home Screen

Meet Your New Mobile Hub  Hello Splunk Community!  Staying connected to your data—no matter where you are—is ...

Introducing Value Insights (Beta): Understand the Business Impact your organization ...

Real progress on your strategic priorities starts with knowing the business outcomes your teams are delivering ...

Enterprise Security (ES) Essentials 8.3 is Now GA — Smarter Detections, Faster ...

As of today, Enterprise Security (ES) Essentials 8.3 is now generally available, helping SOC teams simplify ...