Splunk Search

Field extraction using REGEX

meenal901
Communicator

Hi,

I have a flat file of this format:

0229052320112MARGARET CHODKIEWICZ     APT 603-2100 SHEROBEE RD R164I00022B0A2013-01-022013-01-082013-01-0953N54 UNETCH 012013-01-08          9052320112                                                          5201  

I need to capture the first 3 digits as CUST-CODE.

I have written the below in config files:

PROPS.CONF

[source1]
TRANSFORMS-mysource= source1trans
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false

TRANSFORMS.CONF

[source1trans]
DEST_KEY = MetaData:CUST-CD
REGEX = ^\d{3}
FORMAT = DF-SO-CUST-CD::$1

Still, the field CUST-CD is not captured at search time. I also tried IFX to extract the field, and the extraction is saved, but CUST-CD is not visible in the interesting fields.
Please suggest what I missed.

Thanks.


meenal901
Communicator

Just another question:

Is there a way to write the REGEX when we know the columns are fixed length?
For the below data:

0229052320112MARGARET CHODKIEWICZ APT 603-2100 SHEROBEE RD R164I00022B0A2013-01-022013-01-082013-01-0953N54 UNETCH 012013-01-08 9052320112

I know the data format, but it's difficult to catch patterns. Can we specify where each column begins and ends, and if so, how?


kristian_kolb
Ultra Champion

First, unless you are sure that you need to make this extraction at index time, you should not use TRANSFORMS. I also doubt that MetaData:CUST-CD is a valid DEST_KEY; at least, I don't think it does what you want.

Abandon that line of reasoning and instead, do this in props.conf only.

[source1]
EXTRACT-blah=(?m)^(?<CUST_ID>\d{3})
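
To sanity-check the pattern outside Splunk, here is a minimal Python sketch run against the start of the sample line from the question. It is an illustration only: Splunk evaluates EXTRACT with a PCRE-style engine, while Python's re module needs named groups spelled `(?P<name>...)`, so the pattern is adapted accordingly.

```python
import re

# Start of the sample event from the question; only the leading digits matter here.
event = "0229052320112MARGARET CHODKIEWICZ     APT 603-2100 SHEROBEE RD"

# Same pattern as the EXTRACT above, with the group spelled (?P<name>...) for Python.
pattern = re.compile(r"(?m)^(?P<CUST_ID>\d{3})")

match = pattern.search(event)
cust_id = match.group("CUST_ID") if match else None
print(cust_id)
```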

Hope this helps,

Kristian

kristian_kolb
Ultra Champion

I believe you can do this like

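EXTRACT-stuff=^(?<field1>\d{3})(?<field2>\d{10})(?<field3>\S+)\s+(?<field4>\S+)\s+ etc etc

It's just a matter of defining your regexes. The names field1, field2 and so on are placeholders; substitute your own field names.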
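
For truly fixed-length columns, one option is to give each named group an exact width with `.{n}` rather than an open-ended `\S+`. Below is a minimal Python sketch under stated assumptions: only the first two widths (3 and 10) come from the regex above; the field names `cust_cd` and `phone` and the helper `build_fixed_width_regex` are hypothetical, made up for illustration, and the real column layout has to come from your file specification.

```python
import re

def build_fixed_width_regex(columns):
    """Build an anchored regex from (name, width) pairs, one named group per column."""
    parts = [f"(?P<{name}>.{{{width}}})" for name, width in columns]
    return "^" + "".join(parts)

# Hypothetical layout: the names are placeholders; widths 3 and 10 follow the regex above.
columns = [("cust_cd", 3), ("phone", 10)]
pattern_text = build_fixed_width_regex(columns)

# Sample event as posted in the question.
event = ("0229052320112MARGARET CHODKIEWICZ APT 603-2100 SHEROBEE RD "
         "R164I00022B0A2013-01-022013-01-082013-01-0953N54 UNETCH 012013-01-08 9052320112")

match = re.match(pattern_text, event)
fields = match.groupdict() if match else {}
print(pattern_text)
print(fields)
```

The generated pattern string is what would go after `EXTRACT-<class>=` in props.conf; extending `columns` with the remaining widths covers the rest of the record. Treat it as a starting point and verify it against real events before relying on it.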

kristian_kolb
Ultra Champion

True, that is a bit silly, but that's the way it works. I also forgot to type it in the example above; fixed that now.


meenal901
Communicator

Thanks Kristian!

Earlier I had tried EXTRACT in props.conf, but there were no results, so I went ahead with TRANSFORMS.

The main problem was the field name, CUST-CD. It contained a hyphen, which is not accepted in props.conf. I changed it to CUST_CD in props.conf and it worked. This is what I did:

EXTRACT-CUST_CODE = (?i)(?P<CUST_CD>\d{3})
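
The hyphen restriction comes from the rules for regex group names rather than from the data. The sketch below shows the effect in Python's re module, used here only as a stand-in: it is not the engine Splunk runs, but it applies a similar rule (letters, digits and underscores only). The helper `compiles` is hypothetical, written just for this check.

```python
import re

def compiles(pattern_text):
    """Return True if the pattern compiles, False if the regex engine rejects it."""
    try:
        re.compile(pattern_text)
        return True
    except re.error:
        return False

hyphen_ok = compiles(r"(?i)(?P<CUST-CD>\d{3})")      # hyphen in the group name
underscore_ok = compiles(r"(?i)(?P<CUST_CD>\d{3})")  # underscore instead
print(hyphen_ok, underscore_ok)
```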
