Decompressing log files from Microsoft Azure Blob

NickSegalle
Explorer

I have logs stored in Microsoft Blob Storage that are compressed as .xz files but are not named with that extension. They use the format kubernetes-<datetime> (example: kubernetes-202101310701). I'm trying to ingest these logs into Splunk using the Microsoft Cloud Services app. Because the files are compressed, I believe I need to run unarchive_cmd against them via props.conf, but I'm not sure this is even supported with this app. I've searched high and low and have not found any information confirming that it is. As a side note, these files are Kubernetes logs coming from SAP CC2V, so I have no control over how they are written to blob storage; I can only access them after the fact. When I enable the input, the data starts to stream in, but it's all gibberish because the files are compressed.
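For anyone wanting to confirm that a blob really is xz-compressed despite the missing extension, one option is to compare its first six bytes against the xz magic number. This is only a minimal Python sketch; the function names are made up for illustration, and it assumes a copy of the blob has already been downloaded to a local path.

```python
import lzma

# The xz container format starts with these six bytes: FD 37 7A 58 5A 00.
XZ_MAGIC = b"\xfd7zXZ\x00"


def looks_like_xz(header: bytes) -> bool:
    """Return True if the given leading bytes carry the xz magic number."""
    return header[:len(XZ_MAGIC)] == XZ_MAGIC


def is_xz_file(path) -> bool:
    """Check a local file (hypothetical downloaded blob) by its header only."""
    with open(path, "rb") as fh:
        return looks_like_xz(fh.read(len(XZ_MAGIC)))


if __name__ == "__main__":
    # Self-check with data compressed in memory (lzma defaults to the xz format).
    sample = lzma.compress(b'{"log": "hello"}\n')
    print(looks_like_xz(sample))
```

The check relies on content rather than file name, which is why it works for names like kubernetes-202101310701.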

Here is what I get:

1/31/21
10:32:25.000 AM
 
Geq��)�5xi� ��B�;X�%���Ul���N�ioG�����X��o��47`�RK�Bd�g�x�A���ʪe���a�E�V�����xUS<x�5=�H�R�4��2

 

Below is what I'm trying...

inputs.conf:

[mscs_storage_blob://SAP S3 Logs]
disabled = 0
account = SAP S3
blob_list = kubernetes*
blob_mode = append
collection_interval = 3600
container_name = commerce-logging
sourcetype = mscs:storage:blob:k8
index = test

props.conf:

[source::...(.*)]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs_storage_blob://SAP S3 Logs]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob:k8]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

I know the props.conf is not correct and does not need that many stanzas. I added all of them in an attempt to get it working, since I'm not even sure the app uses the props.conf file. As a side note, if I decompress a file in Azure Blob and then ingest it, it works perfectly. So the question is: can I use 'invalid_cause' and 'unarchive_cmd' in props.conf with the Microsoft Cloud Services app? If this doesn't work, I need to come up with another solution. I'm thinking I could copy the files locally, run them through a standard file monitor input, and attempt the unarchive command there.
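The copy-locally fallback could be sketched roughly as follows. This is an untested-in-production illustration, not something the app provides: it assumes the blobs have already been copied into a local directory by some other means, and the function name, directory arguments, and ".log" suffix are all hypothetical. It decompresses each file in Python with the standard lzma module, so the monitored directory only ever receives plain text.

```python
import lzma
import shutil
from pathlib import Path


def decompress_xz_logs(src_dir, dest_dir, pattern="kubernetes-*"):
    """Decompress xz files from src_dir into dest_dir as <name>.log.

    src_dir  : hypothetical local folder holding the downloaded blobs
    dest_dir : hypothetical folder watched by a Splunk file monitor input
    Returns the list of newly written paths.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for blob in sorted(src.glob(pattern)):
        if not blob.is_file():
            continue
        target = dest / (blob.name + ".log")
        if target.exists():
            continue  # already processed on an earlier run
        # Write to a temporary name first, then rename, so a reader
        # never sees a half-written .log file.
        tmp = target.with_name(target.name + ".tmp")
        with lzma.open(blob, "rb") as compressed, open(tmp, "wb") as out:
            shutil.copyfileobj(compressed, out)
        tmp.replace(target)
        written.append(target)
    return written
```

With this approach the monitor stanza would need to skip the temporary files, for example by restricting it to names ending in .log with a whitelist setting, and the sourcetype would be assigned on the monitor input rather than on the blob input.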
