
How can I get Splunk to index a file containing CICS output (EBCDIC)?

rune_hellem
Contributor

In our WebSphere environment we successfully index all SystemOut.log and SystemErr.log files, except those of one cluster and its members. The problem is that one of the applications writes output from CICS to SystemOut, and I have been told that this output is EBCDIC-encoded. Splunk therefore rejects the file with the following messages:

 02-12-2014 10:46:26.505 +0100 INFO  TailingProcessor - Ignoring file 'E:\logs\MyCluster\SystemOut.log' due to: binary
 02-12-2014 10:46:26.505 +0100 WARN  FileClassifierManager - The file 'E:\logs\MyCluster\SystemOut.log' is invalid. Reason: binary

For the deployment app that defines this file as an input, I have created a props.conf file in the folder:

D:\Splunk\etc\deployment-apps\inputs_prod\default

I have tried all of the settings below, without success:

[source::E:\\logs\\MyCluster\\SystemOut.log]
CHARSET = utf-ebcdic
#CHARSET = auto
#NO_BINARY_CHECK = 1

First
I am not totally sure that the location of the props.conf is correct, but I do believe so.

Secondly
Without really diving into the details and changing the application and how it logs, is it possible to configure Splunk to index the file?

wpreston
Motivator

To my knowledge, Splunk cannot index a binary file; however, the data from the file can be indexed once it is in a non-binary format. There are two approaches you could take:

  1. You could write a scripted input to get the data into Splunk. Your script would essentially read the binary log file, extract the data from it, and write it out in a text format that Splunk can read. See the Splunk docs on setting up scripted inputs.
  2. Try the method in this blog post.

Damien_Dallimor
Ultra Champion

I've done a bit of EBCDIC in my time 🙂

You will need to decode the EBCDIC and encode in ASCII.

You might do this in a scripted input or modular input or pre-process the EBCDIC content before sending to Splunk.

The decoding is trivial in Python:

ebcdic_bytes = b'\xc8\xc5\xd3\xd3\xd6'
# Python 3 syntax; 'EBCDIC-CP-BE' is an alias for Python's cp500 codec
print(ebcdic_bytes.decode('EBCDIC-CP-BE'))
# prints HELLO
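
Building on the decoding step above, a pre-processing script or scripted input could be sketched roughly as follows. This is only a minimal sketch under several assumptions: that the whole file is EBCDIC in code page cp500 (the codec behind the 'EBCDIC-CP-BE' alias; your CICS region may use a different code page such as cp037), and that writing the decoded text to stdout is how the script hands data to Splunk. The function name `convert_stream` and the script name in the usage comment are hypothetical. If only parts of SystemOut.log are EBCDIC, decoding the entire file this way would garble the ASCII parts, so the script would first need to isolate the CICS sections.

```python
import sys


def convert_stream(src, dst, codepage="cp500", chunk_size=65536):
    """Copy an EBCDIC byte stream to a text stream as plain ASCII.

    src is a binary file object, dst is a text file object.
    codepage is an assumption; adjust it to the code page CICS really uses.
    """
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # cp500 is a single-byte encoding, so chunk boundaries are safe.
        text = chunk.decode(codepage, errors="replace")
        # Characters that have no ASCII equivalent are replaced with '?'.
        dst.write(text.encode("ascii", errors="replace").decode("ascii"))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python ebcdic_to_ascii.py <path-to-ebcdic-file>
    with open(sys.argv[1], "rb") as f:
        convert_stream(f, sys.stdout)
```

Reading in chunks keeps memory use flat for large log files. The sketch re-reads the whole file on every run, so a real scripted input would also have to remember how far it got last time to avoid indexing duplicates.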
