Decompressing log files from Microsoft Azure Blob

NickSegalle

I have logs stored in Microsoft Azure Blob Storage that are compressed as .xz files but are not named with that extension. They are named in the format kubernetes-<datetime> (example: kubernetes-202101310701). I'm trying to ingest these logs into Splunk using the Microsoft Cloud Services app. Because the files are compressed, I believe I need to run unarchive_cmd against them using props.conf, but I'm not sure this is even supported with this app. I've searched high and low and have not come across any information that confirms it. As a side note, these files are Kubernetes logs coming from SAP CC2V, so I have no control over how they are written to blob storage; I can only access them after the fact. When I enable the application, the data starts to stream in, but it's all gibberish because the files are compressed.
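
One quick way to confirm that a downloaded blob really is xz data despite the missing extension is to compare its first six bytes with the xz magic number. This is only a minimal Python sketch: the function name `looks_like_xz` is hypothetical, and it assumes the blob contents are already available locally as bytes.

```python
import lzma

# Every .xz stream starts with these six bytes: FD '7' 'z' 'X' 'Z' 00
XZ_MAGIC = bytes([0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00])


def looks_like_xz(data: bytes) -> bool:
    """Return True if the payload begins with the xz magic bytes."""
    return data[:6] == XZ_MAGIC


if __name__ == "__main__":
    # Self-check with data compressed in memory (lzma defaults to the .xz format).
    sample = lzma.compress(b'{"log": "hello"}\n')
    print(looks_like_xz(sample))
    print(looks_like_xz(b'{"log": "hello"}\n'))
```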

Here is what I get:

1/31/21
10:32:25.000 AM
 
Geq��)�5xi� ��B�;X�%���Ul���N�ioG�����X��o��47`�RK�Bd�g�x�A���ʪe���a�E�V�����xUS<x�5=�H�R�4��2

Below is what I'm trying...

inputs.conf:

[mscs_storage_blob://SAP S3 Logs]
disabled = 0
account = SAP S3
blob_list = kubernetes*
blob_mode = append
collection_interval = 3600
container_name = commerce-logging
sourcetype = mscs:storage:blob:k8
index = test

props.conf:

[source::...(.*)]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs_storage_blob://SAP S3 Logs]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

[mscs:storage:blob:k8]
invalid_cause = archive
unarchive_cmd = /usr/bin/xz -cd -
sourcetype = mscs:storage:blob:k8
KV_MODE = json
NO_BINARY_CHECK = true

I know the props.conf is not correct and does not need that many stanzas; I added all of them in an attempt to get it to work, since I'm not even sure the app is using the props.conf file. As a side note, if I decompress the file in Azure Blob and then ingest it, it works perfectly. So the question is: can I use invalid_cause and unarchive_cmd in props.conf with the Microsoft Cloud Services app? If this doesn't work, I need to come up with another solution. I'm thinking I could copy the files locally, run them through a standard file monitor input, and attempt to run the unarchive command there.
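
For the fallback idea, here is a rough Python sketch of decompressing the files locally before a file monitor picks them up, instead of relying on unarchive_cmd. It assumes the blobs have already been copied to a local directory (the copy step is not shown). The directory paths, the `.log` suffix, and the function name `decompress_xz` are all hypothetical, chosen only for illustration.

```python
import lzma
import shutil
from pathlib import Path


def decompress_xz(src: Path, dest_dir: Path) -> Path:
    """Decompress one xz file into dest_dir and return the new file's path.

    The source name has no extension (e.g. kubernetes-202101310701),
    so a ".log" suffix is appended to the output name (an assumption).
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.name + ".log")
    # Stream the data so large log files are not loaded into memory at once.
    with lzma.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return dest


if __name__ == "__main__":
    # Hypothetical locations: where the blobs were copied, and where a
    # Splunk file monitor input would be pointed.
    download_dir = Path("/tmp/blob_downloads")
    monitor_dir = Path("/tmp/k8s_logs")
    for blob_file in sorted(download_dir.glob("kubernetes-*")):
        print(decompress_xz(blob_file, monitor_dir))
```

The monitor input would then watch only the output directory, so Splunk never sees the compressed bytes.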
