Splunk Search

How can I get Splunk to index a file containing CICS output (EBCDIC)?

rune_hellem
Contributor

In our WebSphere environment we successfully index all SystemOut.log and SystemErr.log files, except those for one single cluster and its members. The problem is that one of the applications logs output from CICS to SystemOut, and I have been told that this output is EBCDIC-encoded. As a result, Splunk rejects the file with the following messages:

 02-12-2014 10:46:26.505 +0100 INFO  TailingProcessor - Ignoring file 'E:\logs\MyCluster\SystemOut.log' due to: binary
 02-12-2014 10:46:26.505 +0100 WARN  FileClassifierManager - The file 'E:\logs\MyCluster\SystemOut.log' is invalid. Reason: binary

For the deployment app defining this file I have created a props.conf file in the folder

D:\Splunk\etc\deployment-apps\inputs_prod\default

I have tried all of the settings below, without success:

[source::E:\\logs\\MyCluster\\SystemOut.log]
CHARSET = utf-ebcdic
#CHARSET = auto
#NO_BINARY_CHECK = 1

First, I am not totally sure that the location of the props.conf is correct, but I do believe so.

Secondly, without really diving into the details and changing the application and how it logs, is it possible to configure Splunk to index the file?

1 Solution

wpreston
Motivator

To my knowledge, Splunk cannot index a binary file; however, the data from the file can be indexed once it is in a non-binary format. There are two approaches you could take:

  1. You could write a scripted input to get the data into Splunk. Your script would essentially read the binary log file, extract the data from it, and put it into a text format readable by Splunk. See the docs on setting up scripted inputs here and here.
  2. Try the method in this blog post.
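
As a rough illustration of the first approach, here is a minimal Python 3 sketch of the core of such a script. It assumes the whole file is EBCDIC in code page 500 (the real code page of the CICS output is not stated in this thread), and the function name `read_new_text` and the offset-file mechanism are hypothetical, not part of Splunk. A scripted input would write the returned text to stdout, which is what Splunk indexes.

```python
import os


def read_new_text(log_path, state_path, encoding="cp500"):
    """Return text appended to log_path since the last call.

    The byte offset of the last read is kept in state_path (a hypothetical
    helper file) so that each run only emits new data.
    """
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)

    # If the log shrank, assume it was rotated and start over.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    with open(state_path, "w") as f:
        f.write(str(offset + len(data)))

    # cp500 is a single-byte code page, so any byte offset is a safe split point.
    return data.decode(encoding, errors="replace")
```

In the script itself you would end with something like `sys.stdout.write(read_new_text(log_path, state_path))`, with both paths chosen for your environment.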

Damien_Dallimor
Ultra Champion

I've done a bit of EBCDIC in my time 🙂

You will need to decode the EBCDIC and encode in ASCII.

You might do this in a scripted input or a modular input, or pre-process the EBCDIC content before sending it to Splunk.

The decoding is trivial in Python:

# Python 3: EBCDIC data is a bytes object; 'EBCDIC-CP-BE' is an alias of the cp500 codec
ebcdic_bytes = b'\xc8\xc5\xd3\xd3\xd6'
print(ebcdic_bytes.decode('EBCDIC-CP-BE'))
# prints HELLO
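
For the pre-processing route, the same decode call can be applied to a whole file. The following is only a sketch under a few assumptions: the file is entirely EBCDIC in code page 500, the function name `convert_ebcdic_file` is made up for illustration, and the output is written as UTF-8 (a superset of ASCII) rather than strict ASCII so that non-ASCII characters in the code page are not lost. Splunk would then monitor the converted file instead of the original.

```python
def convert_ebcdic_file(src_path, dst_path, encoding="cp500", chunk_size=64 * 1024):
    """Decode an EBCDIC file and write it out as UTF-8 text."""
    # newline="" keeps line endings exactly as they were decoded.
    with open(src_path, "rb") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # cp500 is single-byte, so decoding chunk by chunk is safe.
            dst.write(chunk.decode(encoding, errors="replace"))
```

If the log turns out to mix plain ASCII lines with EBCDIC fragments, this whole-file conversion would garble the ASCII parts, and the decoding would have to be applied only to the EBCDIC portions.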
