Splunk Search

How to extract fields from log

I'm trying to extract fields from a log and failing miserably.
In my first attempt I used a props.conf to specify the delimiter and field names:

[ipoz]
FIELD_NAMES = "Priority","Date","Thread","Category","Message"
FIELD_DELIMITER="\t"

That didn't work for some reason, so I tried using props and transforms instead, first by specifying the delimiter and fields again, and later by switching to a regular expression, like so:

<props.conf>
[ipoz]
REPORT-IPOZ=IPOZ-DELIM
<transforms.conf>
[IPOZ-DELIM]
#DELIMS="\t"
#FIELDS="Priority","Date","Thread","Category","Message"
REGEX=(?<Priority>.+)\t(?<Date>.+)\t(?<Thread>.+)\t(?<Category>.+)\t(?<Message>.+)

I have some control over the formatting of the log, so I can change the delimiter, but since the 5th field can contain commas I feel like using tab for a delimiter is the right choice.
Can anyone help with this? I have confirmed that the log is indeed tab-delimited by checking the logger configuration (log4j), pasting the log into Notepad++ and showing whitespace characters, and using a regex tester to validate the regex.
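Another way to confirm the delimiter is to count tab-separated columns per line. The following is only a minimal Python sketch: the sample lines are retyped with explicit `\t` escapes and assume, as described above, that log4j writes a single tab between fields. `column_counts` is a hypothetical helper for illustration, not anything Splunk provides.

```python
from collections import Counter

def column_counts(lines, delimiter="\t"):
    """Map 'number of columns' -> 'number of lines with that many columns'."""
    return Counter(
        len(line.rstrip("\r\n").split(delimiter))
        for line in lines
        if line.strip()  # ignore blank lines
    )

# Assumed sample: real log lines retyped with explicit tab escapes.
sample = [
    "ERROR\t2019-08-21 11:23:33,224\t[0x00002358]\t[eiam.server.ipoz.sponsorinterfacev1]\tException[-704]: session expired\n",
    " INFO\t2019-08-21 11:19:32,821\t[0x0000222c]\t[eiam.server.ipoz.sponsor]\tSponsor::Sponsor: Poz initialized\n",
]

counts = column_counts(sample)
print(counts)
```

If every line really is tab-delimited with five fields, the only key in the result should be 5; any other key points at lines worth inspecting.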

Here is the relevant section of inputs.conf

[monitor://<path to the not so very good software company>\ipoz.log]
index=casuite
sourcetype=ipoz
disabled = 0
queue = parsingQueue

Below is a sample of the log I am trying to parse.

ERROR   2019-08-21 10:53:32,386 [0x00001cb4]    [eiam.server.ipoz.sponsorinterfacev1]   [src/Poz.cpp:2047] bool __cdecl eiam::server::poz::Poz::detach(const class eiam::core::String &)
 INFO   2019-08-21 11:19:32,821 [0x0000222c]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   SponsorInterfaceV1::clientDetach: detach failed [sessionId: 43d70a8e665b51f5363dc44ad1f5537d-5d5c41dd-cb06650-1e2, clienthost: 10.33.52.44:60103]
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   Exception[-704]: session expired
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   [src/SessionManager.cpp:286] class eiam::server::dirobj::Session *__cdecl eiam::server::poz::SessionManager::retrieveSession(const class eiam::core::String &)
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   [src/Poz.cpp:3433] class eiam::server::dirobj::Session *__cdecl eiam::server::poz::Poz::retrieveSession(const class eiam::core::String &)
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   [src/Poz.cpp:2047] bool __cdecl eiam::server::poz::Poz::detach(const class eiam::core::String &)
 INFO   2019-08-21 11:42:41,448 [0x00000bec]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:09:29,304 [0x000012b0]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:10:15,716 [0x00001150]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:14:52,428 [0x000003b8]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:19:49,863 [0x00001e48]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:24:26,612 [0x00001858]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
INFO    2019-08-21 13:27:33,143 [0x00001698]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
1 Solution

I figured this out. First, the delimiter in props.conf shouldn't be in quotes (which seems odd; I would expect a string to be quoted).

Second, I removed "queue = parsingQueue" from inputs.conf.

<inputs.conf>
[monitor://F:\Program Files\CA\SC\EmbeddedEntitlementsManager\logs\ipoz.log]
index=casuite
sourcetype=ca_eem_ipoz
disabled=0

<props.conf>
[ca_eem_ipoz]
FIELD_NAMES="Priority","Date","Thread","Category","Message"
FIELD_DELIMITER=\t

<transforms.conf>
#empty
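To illustrate what the delimiter and field-name settings are meant to produce, here is a rough Python sketch. It is not how Splunk implements the extraction; `split_fields` is a hypothetical helper, the sample line is retyped with explicit `\t` escapes, and capping the number of splits so that extra tabs stay in the last field is a choice made for this sketch only.

```python
FIELD_NAMES = ["Priority", "Date", "Thread", "Category", "Message"]

def split_fields(line, names=FIELD_NAMES, delimiter="\t"):
    """Split one log line on the delimiter and pair the values with field names."""
    # Limit the split so anything after the last expected delimiter stays in Message.
    values = line.rstrip("\r\n").split(delimiter, len(names) - 1)
    return dict(zip(names, (value.strip() for value in values)))

# Assumed sample: a line from the log above, retyped with explicit tabs.
line = (
    "ERROR\t2019-08-21 11:23:33,224\t[0x00002358]\t"
    "[eiam.server.ipoz.sponsorinterfacev1]\t"
    "SponsorInterfaceV1::clientDetach: detach failed "
    "[sessionId: 43d70a8e665b51f5363dc44ad1f5537d-5d5c41dd-cb06650-1e2, "
    "clienthost: 10.33.52.44:60103]"
)

fields = split_fields(line)
for name, value in fields.items():
    print(f"{name}: {value}")
```

The commas and spaces inside the message survive intact because only tabs are treated as separators.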

SplunkTrust

I believe you are missing something in the REGEX on your second attempt.

Try it like this:

props.conf

[ipoz]
REPORT-IPOZ=IPOZ-DELIM

transforms.conf

[IPOZ-DELIM]
REGEX=(?<Priority>[^\t\s]+)[\t\s]+(?<Date>[^\t\s]+)[\t\s]+(?<Thread>[^\t\s]+)[\t\s]+(?<Category>[^\t\s]+)[\t\s]+(?<Message>.+)

You can validate the regex here: https://regex101.com/r/dcaJTn/1
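A whitespace-based pattern can also be sanity-checked outside Splunk. The sketch below uses Python, which is an assumption made for illustration, and it is a variant pattern rather than the exact one posted above: the timestamp in the sample contains a space, so here the Date group is assumed to span two tokens (date and time), and Thread and Category are assumed to be bracketed. Note that Python spells named groups as `(?P<name>...)`.

```python
import re

# Variant pattern (an assumption for this sketch): Date covers "date time",
# Thread and Category are the two bracketed tokens, Message is the rest.
PATTERN = re.compile(
    r"^\s*(?P<Priority>\S+)\s+"
    r"(?P<Date>\S+\s+\S+)\s+"
    r"(?P<Thread>\[[^\]]+\])\s+"
    r"(?P<Category>\[[^\]]+\])\s+"
    r"(?P<Message>.+)$"
)

# One line copied from the sample log in the question.
line = (
    " INFO   2019-08-21 11:19:32,821 [0x0000222c]    "
    "[eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized"
)

match = PATTERN.match(line)
fields = match.groupdict() if match else {}
print(fields)
```

Checking the captured groups against a few real lines, including the ones with a leading space before INFO, should show quickly whether each field lands where expected.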

------------
Hope I was able to help you. If so, an upvote would be appreciated.

Champion

You should be able to extract this using regex at search time; in this case, try tinkering with your conf files only as a last resort.

| makeresults
| eval Description="ERROR    2019-08-21 10:53:32,386    [0x00001cb4]    [eiam.server.ipoz.sponsorinterfacev1]    [src/Poz.cpp:2047] bool __cdecl eiam::server::poz::Poz::detach(const class eiam::core::String &)"
| rex field=Description "(?<pri>.*?)\s+"
| rex field=Description "\s+(?<date>.*?)\s+\["
| rex field=Description "\[+(?<thread>.*?)\]"
| rex field=Description "\]\s+\[+(?<cat>.*?)\]"
| rex field=Description ".*\[+(?<msg>.*?)\)"

If this is what you need, there is no need to hard-code Description; just replace rex field=Description with rex field=_raw.


I've got a bunch of similar logs to parse and I was really hoping to break it into fields on the forwarder and not the indexer.


Champion

OK, but you are using \t (tab) as the delimiter. Have you tried \s, \s+, a plain space, or ' '?
https://docs.splunk.com/Documentation/Splunk/7.3.1/Data/Extractfieldsfromfileswithstructureddata

Special value       Props.conf representation
form feed           \f
space               space or ' '
horizontal tab      \t or tab
vertical tab        \v
whitespace          whitespace
none                none or \0
file separator      fs or \034
group separator     gs or \035
record separator    rs or \036
unit separator      us or \037


Champion

hi @insert_regex_here
Did you try the above options?


I can't use a space because the last field contains spaces. I think tab is most appropriate for my data.
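As a quick illustration of the difference, here is a small Python sketch comparing the two delimiters on one line. The line is retyped with explicit `\t` escapes, which assumes the tab-delimited layout described earlier in the thread.

```python
# Assumed sample: a real log line retyped with explicit tabs between fields.
line = (
    "ERROR\t2019-08-21 11:23:33,224\t[0x00002358]\t"
    "[eiam.server.ipoz.sponsorinterfacev1]\tException[-704]: session expired"
)

by_tab = line.split("\t")    # split only on tabs
by_whitespace = line.split() # split on any run of whitespace

print(len(by_tab), by_tab[-1])
print(len(by_whitespace), by_whitespace[-3:])
```

Splitting on tabs yields the five expected fields, while splitting on whitespace breaks both the timestamp and the message into separate pieces.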


Champion

I am guessing at a few fields here, of course, particularly your Message and Category fields.
