
Field extraction regex issues

svarendorff
Explorer

I am having an issue with a field extraction.

Source:

SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC

https://regex101.com/ shows that ^[^\.\n]*SESSION:(?P<Session>.*) will work.

When I try it in Splunk, however, it returns almost the complete message, as if it does not see the newline.

Basically, I want everything from SESSION: to the end of the line or, if Splunk cannot do that, up to Client.


yuanliu
SplunkTrust

Why struggle with \n when this will do? (Newline handling is one area where SPL's PCRE is not fully PCRE-conformant.)

| rex "SESSION: (?<Session>.+)"

This is my emulation

| makeresults
| fields - _time
| eval _raw = "--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC"
``` data emulation above ```

The result is

| Session | _raw |
|---|---|
| Session closed | --TIME: 2022-12-23 07:17:09.399 SESSION: Session closed Client address: 123.CCCCCCC Client name: CC222C22[123.123.12.123] User interface: CCCCCCC |
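
For reference, the rex and the emulation above can be combined into one run-anywhere search. This is only a sketch: it swaps .+ for the character class [^\r\n]+, which is an assumption added here and not part of the answer, so that the capture explicitly stops at the first carriage return or line feed instead of relying on how the dot treats line breaks.

```spl
| makeresults
| fields - _time
| eval _raw = "--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC"
| rex "SESSION: (?<Session>[^\r\n]+)"
| table Session
```

On the sample event this should return Session closed in the Session field, the same value as in the result above.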


svarendorff
Explorer

Thank you, yuanliu. I am new to this kind of thing in Splunk, so I am very grateful for any advice and assistance.

So

| rex "SESSION: (?<Session>.+)"

works in a search

However, as a field extraction I get nothing. It likely overlaps with another, similar extraction that looks for Activity. It may be best to hard-code these in the search, as the events contain so many different items.

E.g., the second line has Session, Activity, User, etc., which I would like to extract.
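
For events with several labelled lines like the sample in this thread, one possible in-search approach is a separate rex per label, each capturing up to the end of its own line. The sketch below uses only the labels visible in the sample event. The field names ClientAddress, ClientName and UserInterface are illustrative choices, not names from the thread, and labels such as Activity or User are assumed to follow the same "Label: value" pattern.

```spl
| makeresults
| fields - _time
| eval _raw = "--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC"
| rex "SESSION: (?<Session>[^\r\n]+)"
| rex "Client address: (?<ClientAddress>[^\r\n]+)"
| rex "Client name: (?<ClientName>[^\r\n]+)"
| rex "User interface: (?<UserInterface>[^\r\n]+)"
| table Session ClientAddress ClientName UserInterface
```

Because each rex is independent, an event that lacks one of the labels simply leaves that field empty without affecting the others.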

 


yuanliu
SplunkTrust

In SPL, a newline is often easier to match with \s, which matches any whitespace character including line breaks, than with \n. I don't know the exact rule, to be frank. So you can try something like this in the field extraction:

"TIME: \d{4}(-\d\d){2} \d\d:\d\d:\d\d\.\d{3}\sSESSION: (?<Session>.+)\sClient address:"

It works with rex.
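
The pattern can be checked against the emulated event with rex before it is saved as a field extraction. The sketch below makes two small changes that are assumptions added here, not part of the post: \s+ replaces \s so that a two-character line ending (carriage return plus line feed) is still matched, and the date group is made non-capturing.

```spl
| makeresults
| fields - _time
| eval _raw = "--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC"
| rex "TIME: \d{4}(?:-\d\d){2} \d\d:\d\d:\d\d\.\d{3}\s+SESSION: (?<Session>.+)\s+Client address:"
| table Session
```

If this returns a value in the search but the saved extraction still returns nothing, the difference is more likely in how the extraction is scoped or applied than in the regex itself.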


gcusello
SplunkTrust

Hi @svarendorff,

Please try this regex:

(?ms)^[^\.\n]*SESSION:(?P<Session>.*)\nClient\s+address

Ciao.

Giuseppe

svarendorff
Explorer

Hi Giuseppe,

It looks like that fails, e.g. index = XXXXXXX | stats count by Session returns zero counts.

I did forget to include a line in the sample source, so that is likely the issue. The full event is:

--TIME: 2022-12-23 07:17:09.399
SESSION: Session closed
Client address: 123.CCCCCCC
Client name: CC222C22[123.123.12.123]
User interface: CCCCCCC

 
