Splunk Search

Regex for multi-value field in which some values are listed and then aren't

jwalzerpitt
Influencer

I am trying to create a regex for a multivalue field (Message) in which some values are listed and sometimes aren't listed depending on the event. We are ingesting Shibboleth logs via _json format, and I am trying to extract three values from the Message field: URL, username, and src_ip (in bold in each event).

There are three different events for Shibboleth.
[Screenshot: the three Shibboleth event types]

Is it possible to create a regex that would apply to all three events?

I have one regex that covers the first event and extracts the three fields.

[Screenshot: the regex for the first event]

Thx

0 Karma

inventsekar
SplunkTrust

The username and src_ip extractions look good. Maybe combine those two in a single rex and use a separate rex for the URL.

(The screenshot is fine for reading.) Can you please copy the logs and your rex as text so that we can test it?

0 Karma

jwalzerpitt
Influencer

I've been struggling with the frustrating code tag markdown as I selected the code button, which adds the tickmarks to the beginning and end of the code, but the page still yells at me when I go to post it

0 Karma

inventsekar
SplunkTrust

Posting it in a comment would be difficult. Please post it as a separate answer, or edit your question and add the text.

0 Karma

jwalzerpitt
Influencer

I believe I figured it out - I had to create three separate regexes, one for each field, and when evaluating, I did not see any Non-Matches for each regex. Regexes are as follows:

^(?:[^|\n]*|){13}(?P[^|]+)
^(?:[^|\n]*|){3}(?P[^|]+)
^(?:[^|\n]*|){8}(?P[^|]+)
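
For anyone reading along, here is a minimal Python sketch of how this kind of positional regex behaves: it skips a fixed number of pipe-terminated fields and captures the next one. The sample strings, the helper nth_field_pattern, and the group name "value" are all assumptions for illustration, not taken from the real Shibboleth events, and the pipe is written escaped as \| so that it is read as a literal.

```python
import re

def nth_field_pattern(skip, name):
    """Build a regex that skips `skip` pipe-terminated fields from the
    start of the line, then captures the next non-empty field as `name`."""
    return re.compile(r"^(?:[^|\n]*\|){%d}(?P<%s>[^|]+)" % (skip, name))

# Hypothetical pipe-delimited messages, NOT real Shibboleth log lines.
full_message = "f0|f1|f2|f3|f4|f5"
sparse_message = "a||c"

pattern = nth_field_pattern(3, "value")
match = pattern.search(full_message)
captured = match.group("value") if match else None
print(captured)

# An empty field at the target position produces no match at all,
# because [^|]+ requires at least one character.
empty_hit = nth_field_pattern(1, "value").search(sparse_message)
print(empty_hit)
```

Because an empty position gives no match rather than an empty value, one regex per field lets each field be present or absent independently across the different event types.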

0 Karma

sundareshr
Legend

It would be safer to create three separate regexes. That way extraction is not affected by minor changes to the log format. I would suggest, in the field extraction UI, create three separate field extraction rules.

*UPDATED*

props.conf
[stanza_name]
REPORT-extract_mv_fields = extract_url, extract_src_ip, extract_user

transforms.conf
[extract_url]
REGEX=(?<url>http[^\|]+)
MV_ADD=true

[extract_src_ip]
REGEX=(?<src_ip>\d+\.\d+\.\d+\.\d+)
MV_ADD=true

[extract_user]
REGEX=\|{4}(?<user>\w+)\|{5}"
MV_ADD=true
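
As a rough way to sanity-check the idea outside Splunk, the sketch below runs the url and src_ip patterns independently in Python and collects every match into a list, loosely similar to what MV_ADD does. The sample messages and the extract_fields helper are hypothetical, and the user pattern is left out because it depends on the exact pipe layout of the real events. Note that Python needs the (?P<name>...) spelling for named groups.

```python
import re

# One independent pattern per field; Python spells named groups (?P<name>...).
EXTRACTIONS = {
    "url": re.compile(r"(?P<url>http[^|]+)"),
    "src_ip": re.compile(r"(?P<src_ip>\d+\.\d+\.\d+\.\d+)"),
}

def extract_fields(message):
    """Return {field: [values]} for every pattern that matches at least once."""
    fields = {}
    for name, pattern in EXTRACTIONS.items():
        values = [m.group(name) for m in pattern.finditer(message)]
        if values:
            fields[name] = values
    return fields

# Hypothetical messages, NOT real Shibboleth log lines.
login = "https://sp.example.org/secure|192.0.2.10|jdoe"
logout = "192.0.2.10|jdoe"

print(extract_fields(login))
print(extract_fields(logout))
```

A field that does not appear in an event is simply absent from the result, so the same set of extractions can be applied to all three event types.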

0 Karma

jwalzerpitt
Influencer

I tried to do three separate regexes, but Splunk yelled when I tried to reuse the field extracted names (url, username, src_ip) for the second regex (logout with username).

Thx

0 Karma

sundareshr
Legend

Try the updated answer.

0 Karma