I am trying to create a regex for a multivalue field (Message) in which some values are present in some events and absent in others. We are ingesting Shibboleth logs in _json format, and I am trying to extract three values from the Message field: URL, username, and src_ip (in bold in each event).
There are three different events for Shibboleth.
Is it possible to create a regex that would apply to all three events?
I have one regex that covers the first event and extracts the three fields.
Thx
The username and src_ip extractions look good. Maybe combine those two in a single rex and use a separate rex for the URL.
(The photo is fine for reading.) Can you please copy the logs and your rex as text, so that we can test it?
I've been struggling with the frustrating code-tag markdown: I selected the code button, which adds backticks to the beginning and end of the code, but the page still yells at me when I go to post it.
Posting it in a comment would be difficult. Please post it as a separate answer, or edit your question and add the text.
I believe I figured it out. I had to create three separate regexes, one for each field, and when evaluating them I did not see any Non-Matches for any of the three. The regexes are as follows:
^(?:[^|\n]*\|){13}(?P[^|]+)
^(?:[^|\n]*\|){3}(?P[^|]+)
^(?:[^|\n]*\|){8}(?P[^|]+)
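The shape of these patterns (skip N pipe-terminated fields, then capture the next one) can be sketched in Python to see how it behaves. This is only an illustration under assumptions: the sample line is made up and is not a real Shibboleth event, the helper `field_after` and the group names are hypothetical, and the literal pipe is written escaped (`\|`) as the pattern requires.

```python
import re

def field_after(n_pipes: int, name: str) -> re.Pattern:
    """Build a regex that skips n_pipes pipe-terminated fields and captures the next one."""
    return re.compile(rf"^(?:[^|\n]*\|){{{n_pipes}}}(?P<{name}>[^|]+)")

# Hypothetical pipe-delimited line, invented for this sketch.
sample = "a|b|c|https://sp.example.org/shibboleth|e|f|g|h|jdoe|j"

fourth = field_after(3, "fourth").match(sample)   # field after 3 pipes
ninth = field_after(8, "ninth").match(sample)     # field after 8 pipes
print(fourth.group("fourth"))
print(ninth.group("ninth"))

# When the target field is empty, [^|]+ cannot match, so there is no result at all.
empty = field_after(3, "fourth").match("a|b|c||e")
print(empty)
```

Because the capture uses `[^|]+`, an event where the field is empty simply produces no match, which is consistent with a value that appears in some events and not in others.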
It would be safer to create three separate regexes. That way extraction is not affected by minor changes to the log format. I would suggest creating three separate field extraction rules in the field extraction UI.
*UPDATED*
props.conf
[stanza_name]
REPORT-extract_mv_fields = extract_url, extract_src_ip, extract_user
transforms.conf
[extract_url]
REGEX=(?<url>http[^\|]+)
MV_ADD=true
[extract_src_ip]
REGEX=(?<src_ip>\d+\.\d+\.\d+\.\d+)
MV_ADD=true
[extract_user]
REGEX=\|{4}(?<user>\w+)\|{5}"
MV_ADD=true
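A quick way to sanity-check patterns like these outside Splunk is to run them in Python against a test string. The sketch below makes several assumptions: the `message` value is invented for illustration and is not a real Shibboleth event, the IP pattern's group is named `src_ip` (the intended field name), and the groups use Python's `(?P<name>...)` spelling because Python's `re` module does not accept `(?<name>...)`. Collecting every match with `finditer` roughly mimics what MV_ADD=true does.

```python
import re

# Python equivalents of the three transforms.conf patterns.
PATTERNS = {
    "url": re.compile(r"(?P<url>http[^\|]+)"),
    "src_ip": re.compile(r"(?P<src_ip>\d+\.\d+\.\d+\.\d+)"),
    "user": re.compile(r"\|{4}(?P<user>\w+)\|{5}\""),
}

def extract_all(text: str) -> dict:
    """Return every match per field, similar in spirit to MV_ADD=true."""
    return {
        name: [m.group(name) for m in pattern.finditer(text)]
        for name, pattern in PATTERNS.items()
    }

# Hypothetical event text, invented for this sketch.
message = '"Message":"2020|https://idp.example.edu/idp/profile|10.1.2.3||||jdoe|||||"'

fields = extract_all(message)
print(fields)

# An event without a username yields an empty list for that field only.
no_user = extract_all('"Message":"2020|https://idp.example.edu/idp/profile|10.1.2.3|"')
print(no_user["user"])
```

Since each field has its own pattern, an event that lacks one value still yields the other two, which is the point of keeping the extractions separate.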
I tried to do three separate regexes, but Splunk yelled when I tried to reuse the extracted field names (url, username, src_ip) for the second regex (logout with username).
Thx
Try the updated answer.