Getting Data In

Decompressing log files from Microsoft Azure Blob

NickSegalle
Loves-to-Learn Everything

I have logs stored in Microsoft Azure Blob Storage that are compressed as .xz files, but they are not named with that extension; they use the format kubernetes-<datetime> (example: kubernetes-202101310701). I'm trying to ingest these logs into Splunk using the Microsoft Cloud Services app. Because the files are compressed, I believe I need to run unarchive_cmd against them via props.conf, but I'm not sure this is even supported with this app. I've searched high and low and have not found any information that says it is. As a side note, these files are Kubernetes logs coming from SAP CC2V, so I have no control over how they are written to blob storage; I can only access them after the fact. When I enable the input, the data starts to stream in, but it's all gibberish because the files are compressed.
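Since the blob names carry no extension, one way to confirm that a blob really is xz-compressed is to look at its leading bytes. The following is a minimal Python sketch, assuming a blob has already been copied to a local path; the function names are hypothetical, and the six-byte value is the standard xz stream header signature.

```python
import lzma

# Signature found at the start of every xz stream.
XZ_MAGIC = b"\xfd7zXZ\x00"


def looks_like_xz(data: bytes) -> bool:
    """Return True if the buffer starts with the xz stream signature."""
    return data.startswith(XZ_MAGIC)


def file_looks_like_xz(path: str) -> bool:
    """Read only the first few bytes of a local file and check the signature."""
    with open(path, "rb") as handle:
        return looks_like_xz(handle.read(len(XZ_MAGIC)))
```

Calling file_looks_like_xz() on a local copy of a kubernetes-<datetime> blob would tell you whether the extension-less file is in fact an xz stream before you spend time on the Splunk side.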

Here is what I get:

1/31/21
10:32:25.000 AM
 
Geq��)�5xi� ��B�;X�%���Ul���N�ioG�����X��o��47`�RK�Bd�g�x�A���ʪe���a�E�V�����xUS<x�5=�H�R�4��2

 

Below is what I'm trying...

inputs.conf:

[mscs_storage_blob://SAP S3 Logs]
disabled = 0
account = SAP S3
blob_list = kubernetes*
blob_mode = append
collection_interval = 3600
container_name = commerce-logging
sourcetype = mscs:storage:blob:k8
index = test

 

 

props.conf:

[source::...(.*)]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs_storage_blob://SAP S3 Logs]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob:k8]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

 

 

I know the props.conf is not correct and does not need that many stanzas; I added all of them in an attempt to get it to work, as I'm not even sure the app is using props.conf at all. As a side note, if I decompress the file in Azure Blob and then ingest it, it works perfectly. So the question is: can I use 'invalid_cause' and 'unarchive_cmd' in props.conf with the Microsoft Cloud Services app? If not, I need to come up with another solution. I'm thinking I could copy the files locally, run them through a standard file monitor input, and attempt the unarchive command there.
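For the local-copy fallback, an alternative to relying on unarchive_cmd is to decompress the files before Splunk ever sees them. This is only a rough Python sketch under several assumptions: the blobs have already been downloaded into a local source directory, the destination is a separate directory watched by a standard file monitor input that matches *.log, and the function name and .log suffix are made up for illustration.

```python
import lzma
import shutil
from pathlib import Path


def decompress_blobs(src_dir: str, dst_dir: str, pattern: str = "kubernetes-*") -> list:
    """Decompress every xz file matching pattern in src_dir into dst_dir.

    Output files get a .log suffix so a monitor stanza can match them.
    Returns the list of paths written during this run.
    """
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).glob(pattern)):
        if not src.is_file():
            continue
        target = out / (src.name + ".log")
        if target.exists():
            continue  # already decompressed on an earlier run
        tmp = target.with_name(target.name + ".tmp")
        # Write under a temporary name first so the monitor never
        # picks up a partially written .log file.
        with lzma.open(src, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        tmp.rename(target)
        written.append(target)
    return written
```

Run on a schedule (for example, from cron), this would leave plain-text files such as kubernetes-202101310701.log in the destination directory, which matches the case where ingesting an already decompressed file works.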
