Splunk Search

How to apply regex rules in props.conf and transforms.conf to filter unstructured data before indexing it in Splunk?

Explorer

The requirement is a multilevel filter
1. I need to create a line break at Header|521|02|00|521| which I am doing using props.conf

props.conf

BREAK_ONLY_BEFORE = Header\|\S*\|\S*\|\S*\|521\|
  2. I need to extract a number of fields using transforms.conf
    transforms.conf

    REGEX = (?P<f1>[^|]*)\|(?P<f2>[^|]*)\|(?P<f3>[^|]*)\|(?P<f4>[^|]*)\|(?P<f5>[^|]*)\|(?P<f6>[^|]*)\|(?P<f7>[^|]*)

    DEST_KEY = _raw
    FORMAT = $1,$7

  3. I also need to keep only the events that have a specific value in a field, such as f7=SCL
    The log file looks like this:

**Log file**
512 521 1054 14447916361 SCL@YOK 384 P 2 10GNS@GOC Header|521|02|00|521||SCL@YOK||scl11adm|TYO|NRT|2015-10-14 12:00:33+09:00|2015-10-14 12:00:36+09:00|2015-10-14 11:00:36+08:00|
Identifier 3235897206|
Detail YOK|AHG|SYD|SSE|2015-10-14 11:59:00+09:00|YA4VC|P|P|82.000|0.000||
Reference F7P43||||1|I|
PieceDetail JD014600000733002464|82.0|||178.6|||58.0|||110.0|||140.0||||WPX||||
ExtraCharge YW|JP||0.000||JPY|FOCJPBBX||2015-10-14 11:59:00+09:00||I|
Document|3235897206||FCA||||||||||
DocumentLine 3235897206||1|||||||JP|P|1||0.000||BREAK BULK EXPRESS|KGS.|AUD||
512 15206781 14447916361 SCL@TYO 384 P 2 10GNS2@GOC Header|15206|02|00|521||SCL@TYO||scl11adm|TYO|---|2015-10-14 12:00:36+09:00|2015-10-14 12:00:36+09:00|2015-10-14 11:00:36+08:00|
Identifier 9929275941|
Detail TYO||LBA|SHF|2015-10-14 10:59:00+09:00|NEW0|D|D|0.50|0.40||K|A|DOCUMENT|DOX|0.000|
PieceDetail JD014600002447636977|0.5||||||1.0|||48.0|||39.0||||||||
512 518 246 14447915821 GOP@PEK 384 PKUL 2 10GNS2@GOC

Header||02|00|518||GOP|GOP@PEK||PEK|WOC|2015-10-14 10:59:00+08:00|2015-10-14 10:59:00+08:00||
EventCommon P|JD014600001332139235|||2015-10-14 10:59:00+08:00|PEK|WOC|PEK|PEK|000001|OK||<lhj>|d|
EventSpecific 7329|WOZA|A|||<lhj>|
512 518 246 14447915871 GOP@PEK 384 PKUL 2 10GNS2@GOC Header||02|00|518||GOP|GOP@PEK||PEK|WOC|2015-10-14 10:59:00+08:00|2015-10-14 10:59:00+08:00||
EventCommon P|JD013059718270005069|||2015-10-14 10:59:00+08:00|PEK|WOC|PEK|PEK|000001|OK||<lhj>|d|
EventSpecific7329|WOZA|A|||<lhj>|
512 518 246 14447915931 GOP@PEK 384 PKUL 2 10GNS2@GOC

Header||02|00|518||GOP|GOP@PEK||PEK|WOC|2015-10-14 10:59:00+08:00|2015-10-14 10:59:00+08:00||
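One way to sanity-check the line-breaking pattern outside Splunk is a small Python sketch like the one below. It only tests the regular expression from the question against shortened header fragments from the sample log; it does not emulate Splunk's line merging, and the helper name `starts_new_event` is hypothetical.

```python
import re

# Pattern copied from the BREAK_ONLY_BEFORE setting in the question.
HEADER_521 = re.compile(r"Header\|\S*\|\S*\|\S*\|521\|")

# Shortened header fragments taken from the sample log above.
samples = [
    "Header|521|02|00|521||SCL@YOK||scl11adm|TYO|NRT|",
    "Header|15206|02|00|521||SCL@TYO||scl11adm|TYO|---|",
    "Header||02|00|518||GOP|GOP@PEK||PEK|WOC|",
]


def starts_new_event(line: str) -> bool:
    """Return True if the pattern finds a 521-type header in the line."""
    return HEADER_521.search(line) is not None


results = [starts_new_event(s) for s in samples]
for sample, hit in zip(samples, results):
    print(hit, sample)
```

With these fragments the pattern matches only the two 521-type headers, so the 518-type headers would not start a new event under this setting.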


Influencer

An excellent place to test regular expressions is https://regex101.com/

Header\S+521|\s should be enough to break your event.

As for the field extraction, can you give some examples of what you're trying to extract? The regex you've posted looks like it will match every character.


Explorer

Thank you very much for helping.
The log file is pipe-delimited (although not completely). I have created a regex to extract all the fields delimited by pipes. Then, using the FORMAT statement, I extract only the required text from the REGEX, say $1 and $7 (or f1 and f7). After this, I need to retain only the lines where f7=SCL.

512 15206781 14447916361 SCL@TYO 384 P 2 10GNS2@GOC Header|15206|02|00|521||SCL@TYO||scl11adm|TYO|---|2015-10-14 12:00:36+09:00|2015-10-14 12:00:36+09:00|2015-10-14 11:00:36+08:00|
Identifier 9929275941|
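The extract-then-filter idea described above can be tried out in Python before touching any Splunk configuration. This is only a sketch under stated assumptions: the pattern is a guess at the intended seven pipe-delimited fields named f1 to f7, the event is assumed to start at `Header`, and "f7=SCL" is read as "f7 starts with SCL" because the sample values look like `SCL@TYO`. The function name `keep_event` is hypothetical.

```python
import re

# Assumed pattern: seven pipe-delimited fields named f1..f7,
# built as (?P<f1>[^|]*)\|(?P<f2>[^|]*)\|...\|(?P<f7>[^|]*)
FIELDS = re.compile(r"\|".join(rf"(?P<f{i}>[^|]*)" for i in range(1, 8)))


def keep_event(event: str, wanted: str = "SCL"):
    """Return "f1,f7" when f7 starts with `wanted`, otherwise None."""
    m = FIELDS.search(event)
    if m is None or not m.group("f7").startswith(wanted):
        return None  # event would be discarded
    return f"{m.group('f1')},{m.group('f7')}"


# Header fragments from the sample log, assumed to begin at "Header".
scl_event = "Header|15206|02|00|521||SCL@TYO||scl11adm|TYO|---|"
gop_event = "Header||02|00|518||GOP|GOP@PEK||PEK|WOC|"

kept = [keep_event(e) for e in (scl_event, gop_event)]
print(kept)
```

In Splunk itself, discarding unwanted events at index time is usually done by routing them to nullQueue from transforms.conf rather than by rewriting _raw, so the sketch is only meant for checking the regex and the filter condition.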
