Splunk Search

Field delimitation using character position

Builder

Hi all.

I have some log files like this:

265964455 00000000000000028000000002Fuerza      R              1     0000000100

The field delimitation rules are as follows (for this case; I have similar logs with other character distributions):

First 10 characters or 265964455 = FIELD1 
Next 18 characters or 000000000000000280 = FIELD2
Next 8 characters or 00000002 = FIELD3
Next 12 characters or Fuerza    = FIELD4
...

I tried with:

sourcetype=rsrs | rex field=MultiField "(?<FIELD1>.{10}) (?<FIELD2>.{18}) (?<FIELD3>.{8}) (?<FIELD4>.{12}) ..."

But it didn't work. Can anyone help me, please?

Legend

Try this

sourcetype=rsrs | rex  "(?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})\s*" 
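If the columns are strictly fixed-width, a position-based variant is also possible. The following is only a sketch, assuming each event starts at column 1 of _raw and uses the widths listed in the question (10, 18, 8, 12). It differs from the original attempt in that it has no literal spaces between the groups and no field= option, so rex runs against _raw. The trim() calls, which are an addition here, strip the padding that fixed-width slices carry along.

```
sourcetype=rsrs
| rex "^(?<FIELD1>.{10})(?<FIELD2>.{18})(?<FIELD3>.{8})(?<FIELD4>.{12})"
| eval FIELD1=trim(FIELD1), FIELD4=trim(FIELD4)
```

Because the groups are anchored by position rather than by content, a column that is entirely blank still lands in its own field instead of shifting the following values to the left.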

Builder

One last question: how do I capture 6 characters, whether they are digits, letters, spaces, or anything else?

I tried using rex with \S, \V and \X, but it doesn't work.

Data looks like:

https://www.dropbox.com/s/t5lg8tkzt1prje0/sample.txt?raw=1

And my rex expression:

... | rex "(?<DATA_ID_NRO_DATA>\S{1,10})\s* ..."

One of my problems is with the DATA_LUG_EXP field, which must be returned empty (character positions 141-147 in this line are empty) but instead returns the value of the next field, DATA_ID_COD_RES. In general, the data is very complex to extract perfectly 😞

Builder

@sundareshr any idea how to solve this? The data string has spaces and long text sequences, and I don't know how to proceed.

Thanks!

Legend

Not sure I understand. In the data example you posted, what would the value for DATA_LUG_EXP be if extracted correctly?

Builder

No, there is no data at position 141, so the field must be null. I can publish a small dataset to explain it better, ok?
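One way to get a null for a blank column is to slice by position instead of matching on content. This is a sketch only, assuming DATA_LUG_EXP occupies characters 141-147 of _raw as described above (start 141, length 7); the offsets for other fields would have to come from the real record layout. It uses the eval functions substr() (which counts from 1), trim(), if() and null().

```
... | eval DATA_LUG_EXP=trim(substr(_raw, 141, 7))
| eval DATA_LUG_EXP=if(DATA_LUG_EXP=="", null(), DATA_LUG_EXP)
```

A pattern such as \S{1,10}\s* cannot do this on its own, because \s* consumes the blank column and the next group then captures the following field's value.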

Legend

Try this

sourcetype=rsrs | rex  "(?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})\s*" 

Builder

Works great!
Thanks!
How can I make it permanent? Sorry to bother you.

Builder

Hello,

In props.conf add the following:

[rsrs]
REPORT-extraction = field_extraction

In transforms.conf add:

[field_extraction]
REGEX = (?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})\s*

Regards

Builder

Thank you!

Champion

What do you mean by making it permanent? Learning rex?

Builder

Thanks. I mean putting it correctly in props.conf. Can you please help me with the stanza? I don't know if I also need transforms.conf.

Legend

You can use this regex in the Field Extraction UI (IFX), or add this to your props.conf under the appropriate sourcetype stanza:

EXTRACT-fields = (?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})
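Put together, the stanza might look like the sketch below. The stanza name assumes the sourcetype is rsrs, as in the searches above; the class name after EXTRACT- (here "fields") is an arbitrary label.

```
# props.conf
# Stanza name assumes the sourcetype is rsrs; adjust it to your own sourcetype.
[rsrs]
EXTRACT-fields = (?<field1>\w{1,10})\s*(?<field2>\w{1,18})\s*(?<field3>\d{1,8})\s*(?<field4>\w{1,12})
```

An EXTRACT- setting is an inline search-time extraction, so it needs no transforms.conf entry. transforms.conf only comes into play with the REPORT- form, which references a separate stanza there.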

Builder

Thanks a lot. If a field contains symbols, for example -, how should I capture it?

Data:

000000000001614636IObser4AI-TP 

Command:

 | rex "(?<FIELD1>\w{1,10})\s*(?<FIELD2>\w{1,18})\s*(?<FIELD3>\w{1,6})\s*(?<FIELD4>\w{1,6})\s*"

FIELD4 returns 4AI and should be 4AI-TP. The - symbol is a non-word character, so the match stops there.

Builder

Answering my own question: \S{1,6} works! Thanks again!
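For reference, here is the command from the question with only FIELD4 switched to \S, as a sketch. \S matches any non-whitespace character, so it accepts - but still stops at a space. A class such as [\w-] would be a narrower alternative if hyphens are the only extra symbol to allow.

```
 | rex "(?<FIELD1>\w{1,10})\s*(?<FIELD2>\w{1,18})\s*(?<FIELD3>\w{1,6})\s*(?<FIELD4>\S{1,6})"
```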
