Splunk Search

Field extraction regex issues

svarendorff
Explorer

I'm having an issue with a field extraction.

source:

SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC

https://regex101.com/ shows that ^[^\.\n]*SESSION:(?P<Session>.*) will work.

When I try it in Splunk, it returns almost the complete message, as if it does not see the newline.

Basically, I want everything from SESSION: to the end of the line or, if Splunk cannot do that, up to Client.


yuanliu
SplunkTrust

Why struggle with \n when this will do? (Newline is one character that SPL's regex engine does not handle in a fully PCRE-conformant way.)

| rex "SESSION: (?<Session>.+)"

This is my emulation

| makeresults
| fields - _time
| eval _raw = "--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC"
``` data emulation above ```

The result is

Session | _raw
Session closed | --TIME: 2022-12-23 07:17:09.399 SESSION: Session closed Client address: 123.CCCCCCC Client name: CC222C22[123.123.12.123] User interface: CCCCCCC
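
The capture stops at the end of the SESSION line because, by default, the dot in a regex does not match a newline. Below is a minimal sketch of that behavior using Python's re module rather than Splunk, so it illustrates the regex semantics only and says nothing about how rex is implemented. Python spells named groups (?P<Session>...), and the variable names are illustrative.

```python
import re

# Sample event copied from the thread.
event = (
    "--TIME: 2022-12-23 07:17:09.399\n"
    "SESSION: Session closed\n"
    "Client address: 123.CCCCCCC\n"
    "Client name: CC222C22[123.123.12.123]\n"
    "User interface: CCCCCCC"
)

# By default "." does not match "\n", so ".+" ends at the end of the line.
line_only = re.search(r"SESSION: (?P<Session>.+)", event)
print(line_only.group("Session"))  # Session closed

# With (?s) the dot also matches newlines, so the capture runs to the end of the event.
dotall = re.search(r"(?s)SESSION: (?P<Session>.+)", event)
print(dotall.group("Session").count("\n"))  # 3
```

If an extraction returns nearly the whole message, one thing worth checking is whether a dot-matches-newline flag such as (?s) is in effect somewhere in the pattern.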

svarendorff
Explorer

Thank you, yuanliu. I'm new to this kind of thing in Splunk, so I'm very happy for any advice and assistance.

So

| rex "SESSION: (?<Session>.+)"

works in a search

However, as a field extraction I get nothing, likely because it overlaps with another, similar extraction that looks for Activity. It may be best to hard-code these in the search, since the events have so many different items.

E.g., the second line has Session, Activity, User, etc., which I would like to extract.


yuanliu
SplunkTrust

In SPL, a newline can often be matched with \s (as opposed to \n). To be frank, I don't know the exact rule. So you can try something like this in the field extraction:

"TIME: \d{4}(-\d\d){2} \d\d:\d\d:\d\d\.\d{3}\sSESSION: (?<Session>.+)\sClient address:"

It works with rex.
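
For context, \s is the regex whitespace class, which includes the newline along with spaces and tabs, so it can bridge a line break. The sketch below checks the pattern against the sample event using Python's re module, not Splunk, so it only shows the regex semantics. The named group is rewritten as (?P<Session>...) for Python, and the variable names are illustrative.

```python
import re

# Sample event copied from the thread.
event = (
    "--TIME: 2022-12-23 07:17:09.399\n"
    "SESSION: Session closed\n"
    "Client address: 123.CCCCCCC\n"
    "Client name: CC222C22[123.123.12.123]\n"
    "User interface: CCCCCCC"
)

# The pattern from the post, with (?<Session>...) written as (?P<Session>...) for Python.
pattern = re.compile(
    r"TIME: \d{4}(-\d\d){2} \d\d:\d\d:\d\d\.\d{3}\sSESSION: (?P<Session>.+)\sClient address:"
)
match = pattern.search(event)
print(match.group("Session"))  # Session closed

# \s matches a newline character as well as a space or a tab.
newline_is_whitespace = re.fullmatch(r"\s", "\n") is not None
print(newline_is_whitespace)  # True
```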


gcusello
SplunkTrust

Hi @svarendorff,

Please try this regex:

(?ms)^[^\.\n]*SESSION:(?P<Session>.*)\nClient\s+address

Ciao.

Giuseppe

svarendorff
Explorer

Hi Giuseppe,

It looks like that fails, e.g., I get zero counts for index = XXXXXXX | stats count by Session.

I forgot to include a line in the source, so that is likely the issue. The full event is:

--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC
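
The extra first line matters for any pattern that begins with ^. Below is a minimal sketch, using Python's re module rather than Splunk, of how the two patterns from this thread behave against the full event. It shows regex semantics only and does not explain why the extraction returned zero counts in Splunk. The variable names are illustrative.

```python
import re

# Full event, including the leading TIME line.
event = (
    "--TIME: 2022-12-23 07:17:09.399\n"
    "SESSION: Session closed\n"
    "Client address: 123.CCCCCCC\n"
    "Client name: CC222C22[123.123.12.123]\n"
    "User interface: CCCCCCC"
)

original = r"^[^\.\n]*SESSION:(?P<Session>.*)"
suggested = r"(?ms)^[^\.\n]*SESSION:(?P<Session>.*)\nClient\s+address"

# Without (?m), ^ matches only at the start of the event, and [^\.\n]*
# cannot cross the dot or the newline on the TIME line, so nothing matches.
no_flag = re.search(original, event)
print(no_flag)  # None

# With (?m), ^ can also match at the start of the SESSION line.
multiline = re.search("(?m)" + original, event)
print(repr(multiline.group("Session")))  # ' Session closed'

# The suggested pattern stops the capture just before "\nClient address".
bounded = re.search(suggested, event)
print(repr(bounded.group("Session")))  # ' Session closed'
```

Both captures keep the space after the colon. Putting a literal space after SESSION: in the pattern, as in the rex example earlier in the thread, would leave it out of the captured value.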

 
