Solved: Field delimitation using character position

changux · ‎09-20-2016

Hi all.

I have some log files like this:

265964455 00000000000000028000000002Fuerza      R              1     0000000100

The field delimitation rules are as follows (for this case; I have similar logs with other character layouts):

First 10 characters or 265964455 = FIELD1
Next 18 characters or 000000000000000280 = FIELD2
Next 8 characters or 00000002 = FIELD3
Next 12 characters or Fuerza = FIELD4
...

I tried with:

sourcetype=rsrs | rex field=MultiField "(?<FIELD1>.{10}) (?<FIELD2>.{18}) (?<FIELD3>.{8}) (?<FIELD4>.{12}) ..."

But it didn't work. Can anyone help me, please?

sundareshr · ‎09-20-2016

Try this

sourcetype=rsrs | rex  "(?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})\s*"
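
The pattern can be sanity-checked outside Splunk. The sketch below uses Python's re module as a stand-in for rex (an assumption for illustration only; Python spells named groups (?P<name>...) rather than (?<name>...)) and rebuilds the first four columns of the sample event from the widths given in the question.

```python
import re

# Python stand-in for the rex pattern; (?P<name>...) is Python's named-group syntax.
PATTERN = re.compile(
    r"(?P<field1>\w{1,10})\s*(?P<field2>\w{1,18})\s*"
    r"(?P<field3>\d{1,8})\s*(?P<field4>\w{1,12})\s*"
)

# First four columns of the sample event, padded to the widths from the question.
sample = (
    "265964455".ljust(10)
    + "0" * 15 + "280"
    + "00000002"
    + "Fuerza".ljust(12)
)

def extract(line):
    """Return the named groups as a dict, or None when the pattern does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None

fields = extract(sample)
print(fields)
```

Because \w and \d stop at whitespace, this approach relies on every column holding at least one non-blank character; a column that is entirely spaces would be skipped by \s*.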

changux · ‎09-21-2016

Last question: how do I capture 6 characters, whether they are digits, letters, spaces or anything else?

I tried using rex with \S, \V and \X, but it doesn't work.

Data looks like:

https://www.dropbox.com/s/t5lg8tkzt1prje0/sample.txt?raw=1

And my rex expression:

... | rex "(?<DATA_ID_NRO_DATA>\S{1,10})\s* ..."

One of my problems is with the DATA_LUG_EXP field, which must be returned empty (character positions 141-147 in this line are empty) but instead returns the value of the next field, DATA_ID_COD_RES. In general, the data is very complex to extract perfectly 😞

changux · ‎09-21-2016

@sundareshr any idea how to solve this? The data string has spaces and long text sequences, and I don't know how to proceed.

Thanks!

sundareshr · ‎09-21-2016

Not sure I understand. In the data example you posted, what would the value for DATA_LUG_EXP be if extracted correctly?

changux · ‎09-22-2016

No, at position 141 there is no data, so the field must be null. I can publish a small dataset to explain it better, OK?
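
One way to keep a blank column from swallowing its neighbour is to cut strictly by character position and trim afterwards, instead of letting \s* skip over the padding. A minimal Python sketch of that idea follows; the column names and widths in LAYOUT are hypothetical, since the real layout of the sample file is not shown in this thread.

```python
# Hypothetical layout: (name, width) pairs. The real widths are not given in the thread.
LAYOUT = [("ID", 10), ("LUG_EXP", 7), ("COD_RES", 4)]

def split_fixed_width(line, layout):
    """Slice a line by column width; columns that are all spaces become None."""
    record, start = {}, 0
    for name, width in layout:
        value = line[start:start + width].strip()
        record[name] = value or None
        start += width
    return record

# The middle column is entirely spaces, so it must come back empty.
line = "265964455 " + " " * 7 + "AB12"
record = split_fixed_width(line, LAYOUT)
print(record)
```

The same idea should carry over to rex by using contiguous .{n} groups anchored at ^ with no \s* between them, then trimming each field with eval's trim() function.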

changux · ‎09-21-2016

Works great!
Thanks!
How can I make it permanent? Sorry to disturb.

aakwah · ‎09-21-2016

Hello,

In props.conf add the following:

[rsrs]
REPORT-extraction = field_extraction

In transforms.conf add the following:

[field_extraction]
REGEX = (?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})\s*

Regards

changux · ‎09-21-2016

Thank you!

inventsekar · ‎09-21-2016

What do you mean by making it permanent? Learning rex?

changux · ‎09-21-2016

Thanks. I mean putting it correctly in props.conf. Can you please help me with the stanza? I don't know if I also need transforms.conf.

sundareshr · ‎09-21-2016

You can use this regex in the Field Extraction UI (IFX), or add this to your props.conf under the appropriate sourcetype stanza:

EXTRACT-fields = (?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})

changux · ‎09-21-2016

Thanks a lot. If a field contains symbols, for example -, how should I capture it?

Data:

000000000001614636IObser4AI-TP

Command:

 | rex "(?<FIELD1>\w{1,10})\s*(?<FIELD2>\w{1,18})\s*(?<FIELD3>\w{1,6})\s*(?<FIELD4>\w{1,6})\s*"

FIELD4 returns 4AI and should be 4AI-TP. The symbol - is a non-word character, so the expression stops there.

changux · ‎09-21-2016

Self-answered: \S{1,6} works! Thanks again!
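
The difference comes down to the character classes: \w matches only letters, digits and underscore, while \S matches any non-whitespace character, including the hyphen. A small Python sketch using the 6-character value from the post above (Python's re module is only a stand-in for rex here):

```python
import re

value = "4AI-TP"

# \w stops at the hyphen because "-" is not a word character.
word_only = re.match(r"\w{1,6}", value).group()

# \S accepts any non-whitespace character, so the hyphen is kept.
non_space = re.match(r"\S{1,6}", value).group()

print(word_only, non_space)
```

Note that \S still stops at spaces, so it only suits columns whose values contain no embedded blanks.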
