Hi,
I want to extract the fields using transforms.conf from a flat file. The file contains the following fields. There is no field delimiter. I have tried the below regex but not working. pls help me.
Field Length:
ID -10
FIRST_NAME -15
LAST_NAME -12
ADDRESS - 25
PHONE - 10
Example File1:
ID FIRST_NAME LAST_NAME ADDRESS PHONE
0000001 Name1 Lastname1 Address1 1234567890
00000021 name2 Address2 123456
001 name3 Address3
REGEX = (?i)^(?P<ID>.{10})\S(?<FIRST_NAME>.{15})\S(?<LAST_NAME>.{12})\S(?<ADDRESS>.{25})\S(?<PHONE>.{10})
Thanks,
V.S
One problem with your REGEX above is that you state that there is a non-space character between your fields. Perhaps you want to use \s
instead of \S
? Also, if you want to use that regex, I suggest you change it a little and put it in a normal EXTRACT in props.conf only, like so;
props.conf
[your_sourcetype]
EXTRACT-blah = ^(?P<ID>.{10})\s+(?<FIRST_NAME>.{15})\s+(?<LAST_NAME>.{12})\s+(?<ADDRESS>.{25})\s+(?<PHONE>.{10})
This works well for logs like;
1234567890 gregor von klopp 1, high street 1-555-2345
345323 billy bob ben joe buckingham palace, london +44 123123
So as you can see, this approach requires you to have fixed length fields in your log file.
If your fields do NOT contain spaces, and you have a delimiter that is a single space, it could also be done (better) with a REPORT (which uses transforms.conf);
props.conf
[your_sourcetype]
REPORT-blah = extract_blah
transforms.conf
[extract_blah]
DELIMS = " "
FIELDS = ID, FIRST_NAME, LAST_NAME, ADDRESS, PHONE
Alternatively, you can do the same thing by changing the regex and use an ordinary EXTRACT that does not require fixed length fields;
props.conf
[your_sourcetype]
EXTRACT-bleh = ^(?P<ID>\S+)\s+(?<FIRST_NAME>\S+)\s+(?<LAST_NAME>\S+)\s+(?<ADDRESS>\S+)\s+(?<PHONE>\S+)
These two approaches work with events like;
1234123 gregor von_klopp 1,high_street 1-555-2345
23421 billy_bob_ben joe buckingham_palace,london +44123123
Makes sense?
UPDATE:
If you have truly fixed fields, and the field content is truncated if it's above a certain field length, you'd better use the first approach and remove the delimiters from the regex, i.e.
file looks like;
12345678 HM Queen ElizabN/A Buckingham Palace, London+44123123
0987654321Arnold Schwarzenegg25, Fifth avenue, NY +1-555-123
props.conf
EXTRACT-blah = ^(?P<ID>.{10})(?<FIRST_NAME>.{15})(?<LAST_NAME>.{12})(?<ADDRESS>.{25})(?<PHONE>.{10})
One problem with your REGEX above is that you state that there is a non-space character between your fields. Perhaps you want to use \s
instead of \S
? Also, if you want to use that regex, I suggest you change it a little and put it in a normal EXTRACT in props.conf only, like so;
props.conf
[your_sourcetype]
EXTRACT-blah = ^(?P<ID>.{10})\s+(?<FIRST_NAME>.{15})\s+(?<LAST_NAME>.{12})\s+(?<ADDRESS>.{25})\s+(?<PHONE>.{10})
This works well for logs like;
1234567890 gregor von klopp 1, high street 1-555-2345
345323 billy bob ben joe buckingham palace, london +44 123123
So as you can see, this approach requires you to have fixed length fields in your log file.
If your fields do NOT contain spaces, and you have a delimiter that is a single space, it could also be done (better) with a REPORT (which uses transforms.conf);
props.conf
[your_sourcetype]
REPORT-blah = extract_blah
transforms.conf
[extract_blah]
DELIMS = " "
FIELDS = ID, FIRST_NAME, LAST_NAME, ADDRESS, PHONE
Alternatively, you can do the same thing by changing the regex and use an ordinary EXTRACT that does not require fixed length fields;
props.conf
[your_sourcetype]
EXTRACT-bleh = ^(?P<ID>\S+)\s+(?<FIRST_NAME>\S+)\s+(?<LAST_NAME>\S+)\s+(?<ADDRESS>\S+)\s+(?<PHONE>\S+)
These two approaches work with events like;
1234123 gregor von_klopp 1,high_street 1-555-2345
23421 billy_bob_ben joe buckingham_palace,london +44123123
Makes sense?
UPDATE:
If you have truly fixed fields, and the field content is truncated if it's above a certain field length, you'd better use the first approach and remove the delimiters from the regex, i.e.
file looks like;
12345678 HM Queen ElizabN/A Buckingham Palace, London+44123123
0987654321Arnold Schwarzenegg25, Fifth avenue, NY +1-555-123
props.conf
EXTRACT-blah = ^(?P<ID>.{10})(?<FIRST_NAME>.{15})(?<LAST_NAME>.{12})(?<ADDRESS>.{25})(?<PHONE>.{10})