Field extraction regex issues

svarendorff
Explorer

I'm having an issue with a field extraction.

source:

SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC

https://regex101.com/ shows that ^[^\.\n]*SESSION:(?P<Session>.*) will work.

When I try it in Splunk, it returns almost the complete message, as if it does not see the newline.

Basically, I want everything from SESSION: to the end of the line or, if Splunk cannot do that, up to Client.


yuanliu
SplunkTrust
SplunkTrust

Why struggle with \n when this will do? By default, . does not match a newline, so .+ stops at the end of the line. (Newline handling is one area where SPL's regex is not fully PCRE conformant.)

| rex "SESSION: (?<Session>.+)"

This is my emulation

| makeresults
| fields - _time
| eval _raw = "--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC"
``` data emulation above ```

The result is

Session | _raw
Session closed | --TIME: 2022-12-23 07:17:09.399 SESSION: Session closed Client address: 123.CCCCCCC Client name: CC222C22[123.123.12.123] User interface: CCCCCCC
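
For anyone who wants to check the regex behavior outside Splunk, here is a minimal sketch using Python's re module as a stand-in. It is not Splunk's regex engine, and Python spells the named group (?P<Session>...) where rex also accepts (?<Session>...), but the default rule that . does not cross a newline is the same. The variable names are only for illustration.

```python
import re

# Sample event copied from the thread.
event = (
    "--TIME: 2022-12-23 07:17:09.399\n"
    "SESSION: Session closed\n"
    "Client address: 123.CCCCCCC\n"
    "Client name: CC222C22[123.123.12.123]\n"
    "User interface: CCCCCCC"
)

# Without the (?s) flag, "." does not match "\n",
# so ".+" captures only the rest of the SESSION line.
match = re.search(r"SESSION: (?P<Session>.+)", event)
session = match.group("Session")
print(session)
```

The capture ends at the line break without any explicit \n in the pattern, which is why the shorter regex is enough here.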


svarendorff
Explorer

Thank you, yuanliu. I'm new to this kind of thing in Splunk, so I'm very grateful for any advice and assistance.

So

| rex "SESSION: (?<Session>.+)"

works in a search

However, as a field extraction I get nothing, likely because it overlaps with another, similar extraction that looks for Activity. It may be best to hard-code these in the search, as the events contain so many different items.

E.g., the second line can contain Session, Activity, User, etc., which I would like to extract.

 


yuanliu
SplunkTrust
SplunkTrust

In SPL, a newline is often easier to match with \s (as opposed to \n); \s matches any whitespace character, which includes a newline. I don't know the exact rule, to be frank. So, you can try something like this in a field extraction:

"TIME: \d{4}(-\d\d){2} \d\d:\d\d:\d\d\.\d{3}\sSESSION: (?<Session>.+)\sClient address:"

It works with rex.
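
As a rough cross-check outside Splunk, the sketch below runs the same pattern through Python's re module. This is only a stand-in for Splunk's regex engine, and the named group is written (?P<Session>...) because that is the Python spelling. It shows two things: \s matches a newline, and the anchored pattern pulls out just the SESSION value from the sample event.

```python
import re

# Sample event copied from the thread, including the leading TIME line.
event = (
    "--TIME: 2022-12-23 07:17:09.399\n"
    "SESSION: Session closed\n"
    "Client address: 123.CCCCCCC\n"
    "Client name: CC222C22[123.123.12.123]\n"
    "User interface: CCCCCCC"
)

# \s is the whitespace class, and a newline counts as whitespace.
newline_is_whitespace = re.fullmatch(r"\s", "\n") is not None

# Same pattern as above, split over two string literals for readability.
pattern = re.compile(
    r"TIME: \d{4}(-\d\d){2} \d\d:\d\d:\d\d\.\d{3}\s"
    r"SESSION: (?P<Session>.+)\sClient address:"
)
anchored = pattern.search(event)
print(newline_is_whitespace, anchored.group("Session"))
```

Anchoring on the TIME line and on Client address makes the extraction more specific, which may help if another extraction is competing for similar events.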


gcusello
SplunkTrust
SplunkTrust

Hi @svarendorff,

Please try this regex:

(?ms)^[^\.\n]*SESSION:(?P<Session>.*)\nClient\s+address

Ciao.

Giuseppe

svarendorff
Explorer

Hi Giuseppe,

It looks like that fails, e.g. I get zero counts for index = XXXXXXX | stats count by Session.

I did forget to include a line in the source sample, so that is likely the issue. The full event is:

--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC

 
