We index a large volume of financial logs in the Financial Information eXchange (FIX) format. These are not easily human readable, as they consist largely of numeric codes for both field names and values, so I am trying to get Splunk to translate these logs so that when my users search them, they can understand the results without having to reference their FIX documentation.
Each FIX event contains multiple codes for field names and values, which can be translated like this:

FIXcode,translation
38=,OrderQty=
39=0,OrdStatus=New
39=1,OrdStatus=Partially filled
My FIX people currently look up these codes manually in a reference manual, or run the raw log text through a Java translation app.
Has anyone managed to get Splunk to take care of this by itself?
(If not, I'm approaching the problem from a particular angle here: http://answers.splunk.com/questions/886/what-is-the-procedure-to-build-your-own-splunk-search-relate...)
I wrote a simple (20-line, and could be shortened) Python script that is referenced by $SPLUNK_HOME/etc/system/local/commands.conf to become a custom search command usable in Splunk Web. Any search for FIX logs can now just be piped to "translatefix" for human-readable output.
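For anyone wanting to try the same approach, a minimal commands.conf stanza might look like the following (the stanza and filename are illustrative, matching the command name above — adjust to your own script):

```
# $SPLUNK_HOME/etc/system/local/commands.conf
[translatefix]
filename = translatefix.py
# process events as they stream through the search pipeline
streaming = true
# keep the original events (we only rewrite the raw text)
retainsevents = true
```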
IMHO it's much simpler than using a large number of lookup tables (one for each of the 1139 possible fields in FIX5.0) and configuring each in props/transforms. The script takes a single config file, which is a list of strings to match and replace.
It's also better in that it strips out unprintable ASCII characters, such as the field separator SOH (\x01), making the logs much easier to read.
Agreed, the config file will potentially be quite long once all codes are in there (we are just using a "top 500" selection at the moment), but at least it is all contained in the one place, and does not require a Splunk restart (or any Splunk config changes at all) when new fields are added.
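For the curious, the core of such a script can be sketched in a few lines of plain Python. This is my own illustration of the match-and-replace idea, not Glenn's actual script; a real custom search command would also need to read and write results in Splunk's expected format.

```python
# Sketch of a FIX translator: load "match,replace" pairs from a
# config file and apply them to raw event text. Illustrative only.

SOH = "\x01"  # FIX field separator (unprintable ASCII 0x01)

def load_mappings(path):
    """Parse lines like '39=0,OrdStatus=New' into (match, replace) pairs."""
    pairs = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            match, replace = line.split(",", 1)
            pairs.append((match, replace))
    # Longer matches first, so '39=0' is tried before a bare '39='.
    pairs.sort(key=lambda p: len(p[0]), reverse=True)
    return pairs

def translate(raw, pairs):
    """Strip the SOH separators, then substitute each code."""
    text = raw.replace(SOH, " ")
    for match, replace in pairs:
        text = text.replace(match, replace)
    return text
```

With the sample mappings above, a raw event like `38=100<SOH>39=0` comes out as `OrderQty=100 OrdStatus=New`.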
If anyone else would find this useful in their environment, please contact me at glenn.sinclair at igindex dot co dot uk
Hi Glenn,
Can you share your Python script with me? We also want to use Splunk to index FIX logs (for a securities and futures trading system).
I use props.conf and transforms.conf to extract some fields in FIX logs, but I think your Python script is much better and easier to use.
thanks 🙂
Owen Lee
Sure. Send me an email at glenn.sinclair at igindex dot co dot uk so I can get your address, and I'll send you the files. I'll eventually be putting it up on Splunkbase at the request of the Splunk guys here, but haven't had the time to package it as yet.
Glenn, would you mind helping ndoshi by providing some field extraction direction on the question: http://answers.splunk.com/questions/3000/using-delims-to-extract-fix-data/3003#3003
I'm finding this script a little slow for large result sets. If anyone has any (efficiency or other) improvements to make (probably not hard, as my script is very simple at present), please let me know.
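One generic speed-up for this kind of match-and-replace loop (my suggestion, not the author's code) is to compile all the mappings into a single regex alternation, so each event is scanned once rather than once per mapping:

```python
import re

def build_translator(pairs):
    """Compile one alternation over all match strings.
    Longer matches are listed first, so '39=0' wins over a bare '39='."""
    mapping = dict(pairs)
    pattern = re.compile(
        "|".join(re.escape(m)
                 for m in sorted(mapping, key=len, reverse=True))
    )
    # Each regex hit is replaced via a dict lookup on the matched text.
    return lambda text: pattern.sub(lambda mo: mapping[mo.group(0)], text)
```

For a few hundred short mappings this scales with event length instead of (event length × number of mappings), which should help on large result sets.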
Hey Glenn, found this add-on
Thanks,
Vince Lorenca
Thanks Glenn, will have a look at it ASAP.
Actually, I was originally thinking along the same lines as your solution. However, at the moment I am having an engineer write a Python script to extract the field/tag names directly from the fix dot org website. Will let you know how this turns out.
Someone has uploaded my translatefix add on to Splunkbase now - http://splunkbase.splunk.com/apps/All/4.x/Add-On/app:Financial+Information+eXchange+%28FIX%29+Log+Pa...
Note that changes to Splunk lookup tables do not require a restart to take effect.
Well, I'm not completely sure, but it looks to me like you can fairly easily use a combination of Splunk lookup tables and field aliases to decode these, then search and report on them.
You'll need to set CLEAN_KEYS to false if you use Splunk's KV_MODE or DELIMS automatic field extraction, since by default Splunk only accepts a letter (not a number) as the first character of a field name.
Then you can alias the field numbers to names with a series of FIELDALIAS commands. Finally, you can define a series of lookup tables, probably one per field type. These are just CSV files that map the numeric values to some other value.
All of these functions can be set up in props.conf for your sourcetype so they run automatically whenever data from that sourcetype is displayed. Let us know if you would like more details, but all of these are standard functions that should be described in the docs and in the props.conf and transforms.conf specification files.
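To make that concrete, here is a sketch of what the configuration could look like. All stanza, field, and file names here are illustrative, and you would want to verify how your Splunk version handles the \x01 separator in DELIMS:

```
# props.conf -- per-sourcetype settings (sourcetype name is made up)
[fix_log]
REPORT-fixfields = fix_delims
# alias a numeric tag to a readable name
FIELDALIAS-tag38 = 38 AS OrderQty
# decode coded values via a lookup table
LOOKUP-ordstatus = ordstatus_lookup 39 OUTPUT OrdStatus

# transforms.conf
[fix_delims]
DELIMS = "\x01", "="
# allow field names that start with a digit
CLEAN_KEYS = false

[ordstatus_lookup]
filename = ordstatus.csv

# ordstatus.csv (column headers must match the LOOKUP fields above):
#   39,OrdStatus
#   0,New
#   1,Partially filled
```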
Yes, you would need a different table for each field. But then again, the table is just a CSV file (e.g. that would export from Excel) so is highly portable. A script would, I imagine, have to be equally complex, if it does not make use of a backing store of data that is at least as complicated as a series of CSV tables.
Please correct me if I am wrong in thinking that I need a lookup table for each different field btw.
Also, a Python script/search translation function would be more portable, allowing other financial users to use it themselves. It's a common information standard; I'd be surprised if a large number of Splunk users didn't have to deal with it in their environments.
I looked at using a lookup table initially, as I wrongly thought it would search the sourcetype for any occurrence of the string in the lookup column and replace it with whatever was in the OUTPUT column. I soon realised that I would actually need a lookup table for each of the different fields, as lookups work on field values rather than on arbitrary strings. So this is effectively what you are saying... but the problem is that the FIX5.0 standard has 1139 different fields, meaning I would have to maintain that many lookup tables, and that is too many. Hence my custom search command question above.