Decompressing log files from Microsoft Azure Blob

NickSegalle

I have logs stored in Microsoft Blob Storage that are compressed as .xz files but are not named with that extension. They are named in the format kubernetes-<datetime> (example: kubernetes-202101310701). I'm trying to ingest these logs into Splunk using the Microsoft Cloud Services app. Because the files are compressed, I believe I need to run unarchive_cmd against them using props.conf, but I'm not sure this is even supported with this app. I've searched high and low and have not come across any information that says it is. As a side note, these files are Kubernetes logs coming from SAP CC2V, so I have no control over how they are written to blob storage; I can only access them after the fact. When I enable the input, the data starts to stream in, but it's all gibberish because the files are compressed.

Here is what I get:

1/31/21
10:32:25.000 AM
 
Geq��)�5xi� ��B�;X�%���Ul���N�ioG�����X��o��47`�RK�Bd�g�x�A���ʪe���a�E�V�����xUS<x�5=�H�R�4��2
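One quick way to confirm that a blob really is xz data despite the missing extension is to check its first six bytes against the xz magic number. This is only a minimal sketch, assuming the blob has already been copied to local disk; the function name looks_like_xz is made up for illustration.

```python
# First six bytes of every .xz stream: FD 37 7A 58 5A 00
XZ_MAGIC = b"\xfd7zXZ\x00"


def looks_like_xz(path):
    """Return True if the file at `path` starts with the xz magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(len(XZ_MAGIC)) == XZ_MAGIC
```

For example, looks_like_xz("kubernetes-202101310701") should return True for one of these blobs if it is xz-compressed.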

 

Below is what I'm trying...

inputs.conf:

[mscs_storage_blob://SAP S3 Logs]
disabled = 0
account = SAP S3
blob_list = kubernetes*
blob_mode = append
collection_interval = 3600
container_name = commerce-logging
sourcetype = mscs:storage:blob:k8
index = test

props.conf:

[source::...(.*)]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs_storage_blob://SAP S3 Logs]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob:k8]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

I know the props.conf is not correct and does not need that many stanzas; I added all of them in an attempt to get something to work, since I'm not even sure the app uses props.conf for this. As a side note, if I decompress a file in Azure Blob Storage first and then ingest it, it works perfectly. So the question is: can I use 'invalid_cause' and 'unarchive_cmd' in props.conf with the Microsoft Cloud Services app? If not, I need to come up with another solution. I'm thinking I could copy the files locally, run them through a standard file monitor input, and attempt the unarchive command there.
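For the fallback idea, here is a rough sketch of what the local step might look like if the decompression is done before Splunk ever sees the files, rather than through unarchive_cmd. It assumes the blobs have already been copied into a staging directory by some other means, and it only uses Python's standard lzma module. The function name, the directory arguments, and the ".log" suffix are all placeholders I made up, not anything the app provides.

```python
import lzma
import shutil
from pathlib import Path


def decompress_staged_blobs(staging_dir, monitor_dir, pattern="kubernetes-*"):
    """Decompress xz files matching `pattern` from staging_dir into monitor_dir.

    Returns the list of newly written paths. Files that already have a
    decompressed copy in monitor_dir are skipped.
    """
    staging = Path(staging_dir)
    target = Path(monitor_dir)
    target.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sorted(staging.glob(pattern)):
        if not src.is_file():
            continue
        dest = target / (src.name + ".log")  # hypothetical naming choice
        if dest.exists():
            continue  # already decompressed on an earlier run

        # Write to a temporary name first so a half-written file never
        # carries the final ".log" name.
        tmp = dest.with_name(dest.name + ".tmp")
        with lzma.open(src, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        tmp.rename(dest)
        written.append(dest)
    return written
```

A file monitor input pointed at the output directory would then read plain JSON. If the monitor is restricted to names ending in ".log", the temporary ".tmp" files should be ignored while they are being written.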
