Splunk Search

How can I dynamically split my sample data using regex or any other available option?

Shan
Builder

I have data in a log file as shown below. Can I split it using regex, or are any other options available?

0010213002040538

I want to split the data above like this:

001 02 13 
002 04 0538 

For example, we can take:

001 02 13 

001 is a transaction code
02 is the length of the value that follows
13 is the value

Based on the length, I need to split the value dynamically.

So, how can I write the rex search to split it dynamically? If "02" appears as the length, I need to use that length to split off the next value, "13".
If the length is "04", then I need to split based on that length to get "0538".

Thanks in advance
Kindly help me.


alacercogitatus
SplunkTrust

This cannot currently be done. The regular expressions won't ever match properly, and using .* captures far too much data to be useful. The only fix here is to edit the data at the source (or pre-process it with a script) so that it is reformatted correctly.

Here is a sample bash script that will separate out the portions you need.

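#!/bin/bash
# Walk the record as repeated (3-char code, 2-digit length, value) triples.
data="001021300204053800309123d5-78900404data00503get"
myIndex=0
while [ "$myIndex" -lt "${#data}" ]
do
  txnid=${data:myIndex:3}
  myIndex=$((myIndex + 3))
  # Force base 10 so lengths such as "08" or "09" are not read as octal,
  # and so a length of "00" becomes 0 instead of an empty string.
  txnlen=$((10#${data:myIndex:2}))
  myIndex=$((myIndex + 2))
  txnstr=${data:myIndex:txnlen}
  myIndex=$((myIndex + txnlen))
  echo "txnid=$txnid txnlen=$txnlen txnstr=\"$txnstr\""
done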

This can be set up as a scripted input (passing in the correct values for data from the command line), or you can run it against the logs on the server, write the output to a new location, and point the forwarder at the new logs with proper parsing. The result can then be consumed and searched like this:

<your_scripted_input> | table txnid txnlen txnstr
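
If Python is easier to deploy than bash for the pre-processing step, the same loop might look roughly like the sketch below. It assumes the layout described in the question (a 3-character code, a 2-digit decimal length, then the value). The function name parse_record and the decision to raise an error on truncated input are illustrative choices, not part of the script above.

```python
def parse_record(data):
    """Split a record into (txnid, txnlen, txnstr) triples.

    Assumed layout, repeated until the end of the string:
    3-character code, 2-digit decimal length, then `length` characters.
    """
    triples = []
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 5]
        # A complete header is 3 code characters plus 2 decimal digits.
        if len(header) < 5 or not header[3:].isdecimal():
            raise ValueError(f"bad header at offset {pos}: {header!r}")
        txnid = header[:3]
        txnlen = int(header[3:])  # int("09") is 9, no octal surprises
        pos += 5
        txnstr = data[pos:pos + txnlen]
        if len(txnstr) < txnlen:
            raise ValueError(f"value for {txnid} is truncated")
        pos += txnlen
        triples.append((txnid, txnlen, txnstr))
    return triples


if __name__ == "__main__":
    sample = "001021300204053800309123d5-78900404data00503get"
    for txnid, txnlen, txnstr in parse_record(sample):
        # Same key=value shape as the bash script's output.
        print(f'txnid={txnid} txnlen={txnlen} txnstr="{txnstr}"')
```

Emitting key=value pairs keeps the output easy for Splunk to extract automatically at search time.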

woodcock
Esteemed Legend

Not with a single rex but with this chain of commands:

 ... | rex "(?<TransactionCode>.{3})(?<FieldValueLen>.{2})(?<FieldValue>.*)" | eval FieldValue=substr(FieldValue,1,FieldValueLen)
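
To make the two steps concrete, here is a rough Python analogue of that rex plus substr chain. It is only an illustration of the logic, not how Splunk runs it; the function name first_triple is made up for the example, and it assumes the length field always holds decimal digits.

```python
import re

# Same shape as the rex pattern: 3 chars, 2 chars, then everything else.
PATTERN = re.compile(
    r"(?P<TransactionCode>.{3})(?P<FieldValueLen>.{2})(?P<FieldValue>.*)"
)


def first_triple(raw):
    """Return (code, length, value) for the first triple in raw, or None."""
    match = PATTERN.search(raw)
    if match is None:
        return None
    length = match.group("FieldValueLen")
    # Equivalent of substr(FieldValue, 1, FieldValueLen): keep only the
    # first `length` characters of the greedy capture.
    value = match.group("FieldValue")[:int(length)]
    return match.group("TransactionCode"), length, value


if __name__ == "__main__":
    print(first_triple("0010213002040538"))
```

The greedy capture swallows the rest of the event and the substr step trims it back, so a single pass yields one triple per event.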

Shan
Builder

Woodcock,

First of all, thank you very much for your valuable reply.
When I use the above rex search, it splits off the first value and stops there. How can I use the same rex to separate multiple values?

Sample data:

001021300204053800309123d5-78900404data00503get

Current Search:

sourcetype=testrex | table * | rex field=_raw "(?<TransactionCode>.{3})(?<FieldValueLen>.{2})(?<FieldValue>.*)" | eval FieldValue=substr(FieldValue,1,FieldValueLen) | table TransactionCode FieldValueLen FieldValue

Desired Result:

001 02 13
002 04 0538
003 09 123d5-789
004 04 data
005 03 get

Current Result:

TransactionCode FieldValueLen FieldValue
001 02 13
001 02 13

Regards,
Shankar


woodcock
Esteemed Legend

Hopefully you have a limited chain; otherwise an iterative approach like mine won't work. Let's assume you can have at most 4 in a chain; this should work:

... | rex "(?<TransactionCode>.{3})(?<FieldValueLen>.{2})(?<TempFieldValue>.*)"
| eval FieldValue=substr(TempFieldValue,1,FieldValueLen)
| eval TempFieldValue=substr(TempFieldValue,1+FieldValueLen)
| eval subevent=TransactionCode . ":::" . FieldValueLen . ":::" . FieldValue
| rex field=TempFieldValue "(?<TempTransactionCode>.{3})(?<TempFieldValueLen>.{2})(?<TempFieldValue>.*)"
| eval TransactionCode=mvappend(TransactionCode, TempTransactionCode)
| eval FieldValueLen=mvappend(FieldValueLen, TempFieldValueLen)
| eval FieldValue=mvappend(FieldValue, substr(TempFieldValue,1,TempFieldValueLen))
| eval subevent=mvappend(subevent, TempTransactionCode . ":::" . TempFieldValueLen . ":::" . substr(TempFieldValue,1,TempFieldValueLen))
| eval TempFieldValue=substr(TempFieldValue,1+TempFieldValueLen)
| rex field=TempFieldValue "(?<TempTransactionCode>.{3})(?<TempFieldValueLen>.{2})(?<TempFieldValue>.*)"
| eval TransactionCode=mvappend(TransactionCode, TempTransactionCode)
| eval FieldValueLen=mvappend(FieldValueLen, TempFieldValueLen)
| eval FieldValue=mvappend(FieldValue, substr(TempFieldValue,1,TempFieldValueLen))
| eval subevent=mvappend(subevent, TempTransactionCode . ":::" . TempFieldValueLen . ":::" . substr(TempFieldValue,1,TempFieldValueLen))
| eval TempFieldValue=substr(TempFieldValue,1+TempFieldValueLen)
| rex field=TempFieldValue "(?<TempTransactionCode>.{3})(?<TempFieldValueLen>.{2})(?<TempFieldValue>.*)"
| eval TransactionCode=mvappend(TransactionCode, TempTransactionCode)
| eval FieldValueLen=mvappend(FieldValueLen, TempFieldValueLen)
| eval FieldValue=mvappend(FieldValue, substr(TempFieldValue,1,TempFieldValueLen))
| eval subevent=mvappend(subevent, TempTransactionCode . ":::" . TempFieldValueLen . ":::" . substr(TempFieldValue,1,TempFieldValueLen))

Each event now has several new multivalued fields. If you need to break out each subevent into a separate event, add this:

| mvexpand subevent | rex field=subevent "(?<TransactionCode>.*?):::(?<FieldValueLen>.*?):::(?<FieldValue>.*)"  | table TransactionCode FieldValueLen FieldValue
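
For readers who want to trace the idea outside Splunk, the unrolled chain and the final expansion can be sketched in Python as below. This is an analogue of the approach, not SPL; the names chain_extract, expand, and max_steps are invented for the illustration, and the length field is assumed to hold decimal digits.

```python
import re

# One extraction step: 3-char code, 2-char length, then the remainder.
STEP = re.compile(r"(?P<code>.{3})(?P<length>.{2})(?P<rest>.*)", re.DOTALL)


def chain_extract(raw, max_steps=4):
    """Apply the extraction step at most max_steps times.

    Returns ':::'-joined subevents, like the multivalued subevent field.
    """
    subevents = []
    rest = raw
    for _ in range(max_steps):
        match = STEP.match(rest)
        if match is None:  # fewer than 5 characters left
            break
        n = int(match.group("length"))
        value = match.group("rest")[:n]
        subevents.append(
            ":::".join((match.group("code"), match.group("length"), value))
        )
        rest = match.group("rest")[n:]  # advance past this value
    return subevents


def expand(subevents):
    """Rough analogue of mvexpand plus the final rex: one tuple per subevent."""
    return [tuple(s.split(":::", 2)) for s in subevents]


if __name__ == "__main__":
    sample = "001021300204053800309123d5-78900404data00503get"
    for row in expand(chain_extract(sample)):
        print(*row)
```

With max_steps=4, the five-triple sample from this thread yields only its first four triples, which is why the chain has to be unrolled at least as many times as the longest event requires.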

Shan
Builder

Hi Woodcock,

Thank you very much.
I will try it with another sample file.


woodcock
Esteemed Legend

Don't forget to "Accept" the answer to close this question.
