Getting Data In

Decompressing log files from Microsoft Azure Blob

NickSegalle
Explorer

I have logs that are stored in Micrsoft Blob Storage which are compressed as .xz files, but they are not named with that extension, they are in the format: kuberenetes-<datetime> ( example: kubernetes-202101310701).  What I'm trying to do is ingest these logs into Splunk using the Microsoft Cloud Services app.  Because these files are compressed, I believe I need to run the unarchive_cmd against it using props.conf, but I'm not sure this is even supported with this app.  I've searched high and low and have not come across any information that supports it.  As a side note, these files are kuberenetes logs coming from SAP CC2V so I do not have any control of how they are written to blob storage, I can only access them after the fact.  When I enable the application the data starts to stream in but it's all gibberish because the files are compressed. 

Here is what I get:

1/31/21
10:32:25.000 AM
 
Geq��)�5xi� ��B�;X�%���Ul���N�ioG�����X��o��47`�RK�Bd�g�x�A���ʪe���a�E�V�����xUS<x�5=�H�R�4��2

 

Below is what I'm trying...

input.conf:

 

 

[mscs_storage_blob://SAP S3 Logs]
disabled = 0
account = SAP S3
blob_list = kubernetes*
blob_mode = append
collection_interval = 3600
container_name = commerce-logging
sourcetype = mscs:storage:blob:k8
index = test

 

 

props.conf:

 

 

[source::...(.*)]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs_storage_blob://SAP S3 Logs]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob:k8]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

 

 

I know the props.conf is not correct or does not need that many stanzas, but I tried adding all of these in an attempt to get it to work as I'm not even sure it's using the props.conf file.  As a side note, if I decompress the file in Azure Blob and then ingest it, it works perfectly.  So the question is, can I use the 'invalid_cause' and 'unarchive_cmd' in the props for Microsoft Cloud Services app?  If this doesn't work I need to come up with another solution, and I'm thinking I can just copy the files locally and then run it through a standard file monitor process and attempt to run the unarchive command there.

Labels (2)
0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas     Cisco Live 2026 is almost here, and this ...

Data Management Digest – May 2026

Welcome to the May 2026 edition of Data Management Digest!   As your trusted partner in data innovation, the ...