How to extract fields from log

insert_regex_he
Explorer

I'm trying to extract fields from a log and failing miserably.
In my first attempt I used a props.conf to specify the delimiter and field names:

[ipoz]
FIELD_NAMES = "Priority","Date","Thread","Category","Message"
FIELD_DELIMITER="\t"

That didn't work for some reason, so I tried props and transforms instead, first specifying the delimiter and fields again and later switching to a regular expression, like so:

<props.conf>
[ipoz]
REPORT-IPOZ=IPOZ-DELIM
<transforms.conf>
[IPOZ-DELIM]
#DELIMS="\t"
#FIELDS="Priority","Date","Thread","Category","Message"
REGEX=(?<Priority>.+)\t(?<Date>.+)\t(?<Thread>.+)\t(?<Category>.+)\t(?<Message>.+)

I have some control over the formatting of the log, so I can change the delimiter, but since the 5th field can contain commas I feel like using tab for a delimiter is the right choice.
Can anyone help with this??? I have confirmed that the log is indeed tab delimited by checking the logger configuration (log4j), pasting the log into Notepad++ and showing characters, and using a regex tester to validate the regex.
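
The regex-tester check mentioned above can also be repeated locally. Below is a minimal Python sketch, not anything Splunk itself runs. The pattern is the REGEX from transforms.conf rewritten with Python's (?P<name>...) group syntax, and the sample line is retyped with explicit \t separators, which is an assumption because the pasted sample shows spaces. The names PATTERN, SAMPLE, and extract are made up for illustration.

```python
import re

# The transforms.conf REGEX from the question, using Python's (?P<name>...)
# syntax in place of the PCRE-style (?<name>...) groups.
PATTERN = re.compile(
    r"(?P<Priority>.+)\t(?P<Date>.+)\t(?P<Thread>.+)\t(?P<Category>.+)\t(?P<Message>.+)"
)

# Assumption: the five fields are separated by single tab characters.
SAMPLE = (
    "ERROR\t2019-08-21 11:23:33,224\t[0x00002358]\t"
    "[eiam.server.ipoz.sponsorinterfacev1]\tException[-704]: session expired"
)


def extract(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


fields = extract(SAMPLE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Because .+ is greedy, a Message that itself contains a tab would push the extra text into the Priority group. Using [^\t]+ for the first four groups would avoid that.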

Here is the relevant section of inputs.conf

[monitor://<path to the not so very good software company>\ipoz.log]
index=casuite
sourcetype=ipoz
disabled = 0
queue = parsingQueue

Below is a sample of the log I am trying to parse.

ERROR   2019-08-21 10:53:32,386 [0x00001cb4]    [eiam.server.ipoz.sponsorinterfacev1]   [src/Poz.cpp:2047] bool __cdecl eiam::server::poz::Poz::detach(const class eiam::core::String &)
 INFO   2019-08-21 11:19:32,821 [0x0000222c]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   SponsorInterfaceV1::clientDetach: detach failed [sessionId: 43d70a8e665b51f5363dc44ad1f5537d-5d5c41dd-cb06650-1e2, clienthost: 10.33.52.44:60103]
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   Exception[-704]: session expired
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   [src/SessionManager.cpp:286] class eiam::server::dirobj::Session *__cdecl eiam::server::poz::SessionManager::retrieveSession(const class eiam::core::String &)
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   [src/Poz.cpp:3433] class eiam::server::dirobj::Session *__cdecl eiam::server::poz::Poz::retrieveSession(const class eiam::core::String &)
ERROR   2019-08-21 11:23:33,224 [0x00002358]    [eiam.server.ipoz.sponsorinterfacev1]   [src/Poz.cpp:2047] bool __cdecl eiam::server::poz::Poz::detach(const class eiam::core::String &)
 INFO   2019-08-21 11:42:41,448 [0x00000bec]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:09:29,304 [0x000012b0]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:10:15,716 [0x00001150]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:14:52,428 [0x000003b8]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:19:49,863 [0x00001e48]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
 INFO   2019-08-21 13:24:26,612 [0x00001858]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
INFO    2019-08-21 13:27:33,143 [0x00001698]    [eiam.server.ipoz.sponsor]  Sponsor::Sponsor: Poz initialized
1 Solution

insert_regex_he
Explorer

I figured this out. First, the delimiter in the props.conf file shouldn't be in quotes (which seems odd; I'd expect a string to be quoted).

Second, I removed "queue = parsingQueue" from inputs.conf.

<inputs.conf>
[monitor://F:\Program Files\CA\SC\EmbeddedEntitlementsManager\logs\ipoz.log]
index=casuite
sourcetype=ca_eem_ipoz
disabled=0

<props.conf>
[ca_eem_ipoz]
FIELD_NAMES="Priority","Date","Thread","Category","Message"
FIELD_DELIMITER=\t

<transforms.conf>
#empty
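
As a rough illustration of why tab works as the delimiter for this data, here is a small Python sketch that splits a line on tabs and pairs the pieces with the five FIELD_NAMES. It only emulates the idea and is not how Splunk implements FIELD_DELIMITER. The helper name split_fields, the whitespace stripping, and the retyped sample lines with explicit \t separators are all assumptions.

```python
FIELD_NAMES = ["Priority", "Date", "Thread", "Category", "Message"]


def split_fields(line, delimiter="\t"):
    """Split one log line into the five named fields.

    The split is capped at four cuts, so any later delimiter stays inside
    Message. Stripping each value is an illustrative choice that drops the
    leading space seen on the " INFO" lines.
    """
    parts = line.rstrip("\r\n").split(delimiter, len(FIELD_NAMES) - 1)
    return dict(zip(FIELD_NAMES, (part.strip() for part in parts)))


# Assumption: real lines use single tabs between fields.
info_line = (
    " INFO\t2019-08-21 11:19:32,821\t[0x0000222c]\t"
    "[eiam.server.ipoz.sponsor]\tSponsor::Sponsor: Poz initialized"
)
error_line = (
    "ERROR\t2019-08-21 11:23:33,224\t[0x00002358]\t"
    "[eiam.server.ipoz.sponsorinterfacev1]\t"
    "SponsorInterfaceV1::clientDetach: detach failed "
    "[sessionId: 43d70a8e665b51f5363dc44ad1f5537d-5d5c41dd-cb06650-1e2, "
    "clienthost: 10.33.52.44:60103]"
)

info = split_fields(info_line)
error = split_fields(error_line)
print(info["Priority"], "|", info["Message"])
print(error["Priority"], "|", error["Message"])
```

The commas and spaces inside Message survive intact, which is the reason for preferring tab over comma or space here.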

diogofgm
SplunkTrust

I believe you are missing something in the REGEX on your second attempt.

Try it like this:

props.conf

[ipoz]
REPORT-IPOZ=IPOZ-DELIM

transforms.conf

[IPOZ-DELIM]
REGEX=(?<Priority>[^\t\s]+)[\t\s]+(?<Date>[^\t\s]+)[\t\s]+(?<Thread>[^\t\s]+)[\t\s]+(?<Category>[^\t\s]+)[\t\s]+(?<Message>.+)

You can validate the regex here: https://regex101.com/r/dcaJTn/1
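
For a local check against one of the sample lines, here is a hedged Python sketch. It assumes the fields are tab-separated and that the timestamp contains a single space, as in the pasted sample. SUGGESTED is the REGEX above rewritten with Python's (?P<name>...) syntax. VARIANT is a hypothetical alternative, not part of the original answer, that lets Date span the space between date and time.

```python
import re

# The suggested REGEX, rewritten with Python's (?P<name>...) group syntax.
SUGGESTED = re.compile(
    r"(?P<Priority>[^\t\s]+)[\t\s]+(?P<Date>[^\t\s]+)[\t\s]+"
    r"(?P<Thread>[^\t\s]+)[\t\s]+(?P<Category>[^\t\s]+)[\t\s]+(?P<Message>.+)"
)

# Hypothetical variant: Date is two whitespace-free tokens joined by a space.
VARIANT = re.compile(
    r"(?P<Priority>\S+)\s+(?P<Date>\S+ \S+)\s+"
    r"(?P<Thread>\S+)\s+(?P<Category>\S+)\s+(?P<Message>.+)"
)

# Assumption: single tabs between fields, one space inside the timestamp.
SAMPLE = (
    "ERROR\t2019-08-21 11:23:33,224\t[0x00002358]\t"
    "[eiam.server.ipoz.sponsorinterfacev1]\tException[-704]: session expired"
)

suggested = SUGGESTED.search(SAMPLE).groupdict()
variant = VARIANT.search(SAMPLE).groupdict()

for name in ("Priority", "Date", "Thread", "Category", "Message"):
    print(f"{name:9} suggested={suggested[name]!r} variant={variant[name]!r}")
```

Under those assumptions, the suggested pattern stops Date at the space inside the timestamp, so the time lands in Thread and the later groups shift by one. The variant keeps date and time together.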

------------
Hope I was able to help you. If so, some karma would be appreciated.

Sukisen1981
Champion

You should be able to extract this with a regex at search time; in this case, treat tinkering with your conf files as a last resort.

| makeresults
| eval Description="ERROR    2019-08-21 10:53:32,386    [0x00001cb4]    [eiam.server.ipoz.sponsorinterfacev1]    [src/Poz.cpp:2047] bool __cdecl eiam::server::poz::Poz::detach(const class eiam::core::String &)"
| rex field=Description "(?<pri>.*?)\s+"
| rex field=Description "\s+(?<date>.*?)\s+\["
| rex field=Description "\[+(?<thread>.*?)\]"
| rex field=Description "\]\s+\[+(?<cat>.*?)\]"
| rex field=Description ".*\[+(?<msg>.*?)\)"

If this is what you need, there is no need to hard-code Description; just replace rex field=Description with rex field=_raw.


insert_regex_he
Explorer

I've got a bunch of similar logs to parse, and I was really hoping to break them into fields on the forwarder rather than on the indexer.


Sukisen1981
Champion

OK, but you are using \t (tab) as the delimiter. Have you tried \s, \s+, or a plain space (' ')?
https://docs.splunk.com/Documentation/Splunk/7.3.1/Data/Extractfieldsfromfileswithstructureddata

Special value      Props.conf representation
form feed          \f
space              space or ' '
horizontal tab     \t or tab
vertical tab       \v
whitespace         whitespace
none               none or \0
file separator     fs or \034
group separator    gs or \035
record separator   rs or \036
unit separator     us or \037


Sukisen1981
Champion

hi @insert_regex_here
Did you try the above options?


insert_regex_he
Explorer

I can't use a space because the last field contains spaces. I think tab is most appropriate for my data.


Sukisen1981
Champion

I am guessing at a few fields here, of course, particularly your Message and Category fields.
