Splunk Search

How to apply regex rules in props.conf and transforms.conf to filter unstructured data before indexing it in Splunk?

prachisaxena
Explorer

The requirement is a multilevel filter:
1. I need to create a line break at Header|521|02|00|521|, which I am doing using props.conf:

props.conf

BREAK_ONLY_BEFORE = Header\|\S*\|\S*\|\S*\|521\|
2. I need to extract a number of fields using transforms.conf:
    transforms.conf

    REGEX = (?P<f1>[^|]*)\|(?P<f2>[^|]*)\|(?P<f3>[^|]*)\|(?P<f4>[^|]*)\|(?P<f5>[^|]*)\|(?P<f6>[^|]*)\|(?P<f7>[^|]*)

    DEST_KEY = _raw
    FORMAT = $1,$7

3. I also need to keep only the events with a specific value in a field, such as f7=SCL.

The log file looks like this:
512 521 1054 14447916361 SCL@YOK 384 P 2 10GNS@GOC Header|521|02|00|521||SCL@YOK||scl11adm|TYO|NRT|2015-10-14 12:00:33+09:00|2015-10-14 12:00:36+09:00|2015-10-14 11:00:36+08:00|
Identifier 3235897206|
Detail YOK|AHG|SYD|SSE|2015-10-14 11:59:00+09:00|YA4VC|P|P|82.000|0.000||
Reference F7P43||||1|I|
PieceDetail JD014600000733002464|82.0|||178.6|||58.0|||110.0|||140.0||||WPX||||
ExtraCharge YW|JP||0.000||JPY|FOCJPBBX||2015-10-14 11:59:00+09:00||I|
Document|3235897206||FCA||||||||||
DocumentLine 3235897206||1|||||||JP|P|1||0.000||BREAK BULK EXPRESS|KGS.|AUD||
512 15206781 14447916361 SCL@TYO 384 P 2 10GNS2@GOC Header|15206|02|00|521||SCL@TYO||scl11adm|TYO|---|2015-10-14 12:00:36+09:00|2015-10-14 12:00:36+09:00|2015-10-14 11:00:36+08:00|
Identifier 9929275941|
Detail TYO||LBA|SHF|2015-10-14 10:59:00+09:00|NEW0|D|D|0.50|0.40||K|A|DOCUMENT|DOX|0.000|
PieceDetail JD014600002447636977|0.5||||||1.0|||48.0|||39.0||||||||
512 518 246 14447915821 GOP@PEK 384 PKUL 2 10GNS2@GOC

Header||02|00|518||GOP|GOP@PEK||PEK|WOC|2015-10-14 10:59:00+08:00|2015-10-14 10:59:00+08:00||
EventCommon P|JD014600001332139235|||2015-10-14 10:59:00+08:00|PEK|WOC|PEK|PEK|000001|OK||<lhj>|d|
EventSpecific 7329|WOZA|A|||<lhj>|
512 518 246 14447915871 GOP@PEK 384 PKUL 2 10GNS2@GOC Header||02|00|518||GOP|GOP@PEK||PEK|WOC|2015-10-14 10:59:00+08:00|2015-10-14 10:59:00+08:00||
EventCommon P|JD013059718270005069|||2015-10-14 10:59:00+08:00|PEK|WOC|PEK|PEK|000001|OK||<lhj>|d|
EventSpecific7329|WOZA|A|||<lhj>|
512 518 246 14447915931 GOP@PEK 384 PKUL 2 10GNS2@GOC

Header||02|00|518||GOP|GOP@PEK||PEK|WOC|2015-10-14 10:59:00+08:00|2015-10-14 10:59:00+08:00||


jplumsdaine22
Influencer

An excellent place to test regular expressions is https://regex101.com/

Header\S+521|\s should be enough to break your event.

As for the field extraction, can you give some examples of what you're trying to extract? The regex you've posted looks like it will match every character.
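
For the event-breaking part, a minimal props.conf sketch might look like the following. The stanza name `my_pipe_log` is a hypothetical sourcetype, not something given in this thread. `BREAK_ONLY_BEFORE` only takes effect when `SHOULD_LINEMERGE` is true, so it is set explicitly here; the pattern itself is the one from the question.

```ini
# props.conf (sketch; [my_pipe_log] is a placeholder sourcetype name)
[my_pipe_log]
# BREAK_ONLY_BEFORE is only honored when line merging is enabled
SHOULD_LINEMERGE = true
# Start a new event at each line containing a Header|...|521| record
BREAK_ONLY_BEFORE = Header\|\S*\|\S*\|\S*\|521\|
```

With this pattern, lines whose Header record ends in 518 rather than 521 would not start a new event and would stay attached to the preceding one, which may or may not be the intent.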


prachisaxena
Explorer

Thank you very much for helping.
The log file is pipe-delimited (although not completely). I have created a regex to extract all the pipe-delimited fields. Then, using the FORMAT statement, I extract only the required text from the REGEX, say $1 and $7 (or f1 and f7). After this, I need to retain only the lines where f7=SCL.

512 15206781 14447916361 SCL@TYO 384 P 2 10GNS2@GOC Header|15206|02|00|521||SCL@TYO||scl11adm|TYO|---|2015-10-14 12:00:36+09:00|2015-10-14 12:00:36+09:00|2015-10-14 11:00:36+08:00|
Identifier 9929275941|
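
Retaining only the events whose seventh pipe-delimited field begins with SCL is commonly done at index time by routing every event to `nullQueue` and then sending the matching ones back to `indexQueue`. The following is a hedged sketch of that pattern, not a tested configuration: the sourcetype `my_pipe_log` and the stanza names `setnull` and `keep_scl` are placeholders chosen for illustration.

```ini
# props.conf (sketch; [my_pipe_log] is a placeholder sourcetype name)
[my_pipe_log]
# Transforms run in the order listed: discard everything, then rescue SCL events
TRANSFORMS-filter = setnull, keep_scl

# transforms.conf (sketch; stanza names are placeholders)
[setnull]
# Matches every event and routes it to the null queue, where it is discarded
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[keep_scl]
# Six pipe-delimited fields, then a seventh field that begins with SCL
REGEX = ^[^|]*\|[^|]*\|[^|]*\|[^|]*\|[^|]*\|[^|]*\|SCL
DEST_KEY = queue
FORMAT = indexQueue
```

In the sample event above, the seventh field is SCL@TYO, so `keep_scl` would match it, while the GOP events from the original log would stay in the null queue. If a transform with `DEST_KEY = _raw` is also applied, it would presumably need to come after these two in the list, since a rewritten `_raw` no longer has seven pipe-delimited fields to match against.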
