Splunk Search

Regex to return text over multiple lines

ahogbin
Communicator

Hello,

I am trying to extract several lines of text using regex. Whilst I can extract up to the first carriage return, I cannot work out how to extract the subsequent line.

The below is the text I am attempting to extract
[29/07/17 23:33:22:707 EST] 0000003e SystemOut O 23:33:22.707 [WebContainer : 4] ERROR c.a.r.l.controller.NotifyController - OOps
javax.xml.ws.soap.SOAPFaultException: Failed to process response headers

And the regex I am using is
rex "[.*?(?P[^\r\n]+)"

The output using the above is
[29/07/17 23:33:22:707 EST] 0000003e SystemOut O 23:33:22.707 [WebContainer : 4] ERROR c.a.r.l.controller.NotifyController - OOps

How can I expand the above regex to capture the second line (javax.xml.ws.soap.SOAPFaultException: Failed to process response headers) ?

Help will be greatly appreciated.

Cheers,

Alastair

1 Solution

ahogbin
Communicator

Worked out the solution.

rex "(?ms)(?P<ERR>^.*?(?=at))"

This gives me all lines up to, but not including, the first 'at'.

Thanks for the pointers and suggestions.
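
A rough way to check this pattern outside Splunk is the Python sketch below. It uses Python's re module rather than Splunk's PCRE-based rex, on the assumption that the constructs used here behave the same in both. The sample event is modelled on the text in this thread, and the tab before the stack frame is an assumption. Note that the lookahead stops at the first literal "at" anywhere in the text, so it only works because no "at" occurs earlier in this message. The stricter variant, which requires whitespace around "at", is a suggestion and not part of the original solution.

```python
import re

# Sample modelled on the event in this thread; the leading tab on the
# stack-frame line is an assumption about how the raw event is laid out.
event = (
    "[29/07/17 23:33:22:707 EST] 0000003e SystemOut O 23:33:22.707 "
    "[WebContainer : 4] ERROR c.a.r.l.controller.NotifyController - OOps\n"
    "javax.xml.ws.soap.SOAPFaultException: Failed to process response headers\n"
    "\tat org.apache.axis2.jaxws.marshaller.impl.alt.MethodMarshallerUtils"
    ".createSystemException(MethodMarshallerUtils.java:1363)"
)

# The pattern from the solution: lazy match from the start up to the first "at".
solution = re.compile(r"(?ms)(?P<ERR>^.*?(?=at))")

# Hypothetical stricter variant: stop only at an "at" surrounded by whitespace,
# which is how the stack-frame entries begin.
stricter = re.compile(r"(?ms)(?P<ERR>^.*?)(?=\s+at\s)")

err = solution.search(event).group("ERR")
err_strict = stricter.search(event).group("ERR")

print(repr(err))         # ends with the whitespace that precedes "at"
print(repr(err_strict))  # ends right after "response headers"
```

The stricter variant should also cope with events where the frames follow on the same line separated by spaces, since \s+ matches spaces as well as newlines.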

woodcock
Esteemed Legend

Don't forget to upvote the helpful homies.

woodcock
Esteemed Legend

You need to prefix your RegEx with (?ms). The s flag makes the . token also match [\r\n], and the m flag makes ^ and $ match at the start and end of every line rather than only at the ends of the whole text.
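
As a quick illustration of what the two flags do, here is a small Python sketch. Python's re module is not the engine Splunk uses (rex is PCRE-based), but (?s) and (?m) have the same meaning for this purpose. The sample text is shortened from the event in the question and is assumed to contain a real newline between the two lines.

```python
import re

# Two-line sample shortened from the event in the question.
event = (
    "ERROR c.a.r.l.controller.NotifyController - OOps\n"
    "javax.xml.ws.soap.SOAPFaultException: Failed to process response headers"
)

# Without (?s), the dot stops at the newline, so only the first line is captured.
no_flags = re.search(r"(?P<ERR>.+)", event).group("ERR")

# With (?s), the dot also matches newlines, so the capture runs across lines.
dotall = re.search(r"(?s)(?P<ERR>.+)", event).group("ERR")

# With (?m), ^ matches at the start of every line, not only at the start of the text.
line_starts = re.findall(r"(?m)^\S+", event)

print(no_flags)
print(dotall)
print(line_starts)
```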

FritzWittwer_ol
Contributor

I assume you want the 3 lines starting with a timestamp, so I would use

\[\d{2}\/\d{2}\/\d{2}\ (.*?[\r\n]){3}
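
The line-counting idea can be tried out with the Python sketch below. It is an adaptation, not the exact pattern above: the helper name first_lines is hypothetical, the opening bracket is escaped, and each line is matched with [^\r\n]*\r?\n so that a Windows-style \r\n ending counts as one line rather than two. The sample event is illustrative, and the stack frames in it are placeholders.

```python
import re

# Illustrative four-line event; the two stack frames are placeholders.
event = (
    "[29/07/17 23:33:22:707 EST] 0000003e SystemOut O ERROR - OOps\n"
    "javax.xml.ws.soap.SOAPFaultException: Failed to process response headers\n"
    "\tat example.FirstFrame(First.java:1)\n"
    "\tat example.SecondFrame(Second.java:2)\n"
)

def first_lines(text, count):
    """Return the first `count` lines starting at a dd/dd/dd timestamp, or None."""
    # \[ matches the literal bracket; each repetition consumes one full line.
    pattern = r"\[\d{2}/\d{2}/\d{2} (?:[^\r\n]*\r?\n){%d}" % count
    match = re.search(pattern, text)
    return match.group(0) if match else None

two = first_lines(event, 2)
three = first_lines(event, 3)
print(two)
```

A fixed line count only helps when the event really contains line breaks at those points; if the raw text runs on without newlines, a delimiter-based pattern is the safer choice.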

niketn
Legend

@ahogbin, you would need to paste your rex command again with the code button (101010) selected so that special characters are not lost.

From your question, your intent is not very clear. You have pasted your event example, and you are asking to extract the entire content using rex? Ideally you should define a pattern match/substring within the main string. You would need to set the regular expression flags to (?ms), where m => multiline and s => singleline. It is the s flag that makes the dot (.) match newline characters as well (read the reference details on regex101.com for the same).

PS: Since I do not have clarity, the following is just an example to show the syntax (do not consider this your final query).

rex field=_raw "(?ms)(?<ExtractedData>.*)"

For us to assist you better, please clarify what substring you need to extract and what is your current regular expression.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

ahogbin
Communicator

Hello...
Thanks for the advice.. I have tried the various options below but none allow me to progress past retrieving the first 2 lines

[29/07/17 23:33:22:707 EST] 0000003e SystemOut O 23:33:22.707 [WebContainer : 4] ERROR c.a.r.l.controller.NotifyController - OOps

The whole string I am trying to extract is
[29/07/17 23:33:22:707 EST] 0000003e SystemOut O 23:33:22.707 [WebContainer : 4] ERROR c.a.r.l.controller.NotifyController - OOps
javax.xml.ws.soap.SOAPFaultException: Failed to process response headers

Regex I am using is rex field=_raw "(?ms)\[.*?(?P<ERR>[^\r\n]+)"

I know I am missing something but just cannot figure out what.

Cheers,

Alastair

niketn
Legend

It should actually be as follows:

rex field=_raw "(?ms)(?<ERR>.*)"

Here the dot will also match newline characters, i.e. \r and \n.

Alternatively, you can try the following to see whether it is a newline character (\r or \n) or something else:

| eval rawWithoutNewLine=replace(_raw,"[\r\n]+"," ")
| rex field=rawWithoutNewLine "(?ms)(?<ERR>.*)"

ahogbin
Communicator

Still not working... I am now just getting the entire output, which spans many lines, when I really just want the first 3 (well, the first 2 really, as the first line wraps).

[29/07/17 23:33:22:707 EST] 0000003e SystemOut O 23:33:22.707 [WebContainer : 4] ERROR c.a.r.l.controller.NotifyController - OOps
javax.xml.ws.soap.SOAPFaultException: Failed to process response headers
... 31 lines omitted ...

The regex I was using stops at the first return (after the word OOps). How do I get it to also include the second line and then stop at the end of that line (javax.xml.ws.soap.SOAPFaultException: Failed to process response headers)?

The provided regex (and thank you for this) also picks up the 31 omitted lines.

ahogbin
Communicator

I think I see the issue. The second part of the string is not terminated by a carriage return or newline; it just continues on:

13/07/17 23:07:44:186 EST] 00000040 SystemOut O 23:07:44.185 [WebContainer : 8] ERROR c.a.r.l.controller.NotifyController - OOps javax.xml.ws.soap.SOAPFaultException: Failed to process response headers at org.apache.axis2.jaxws.marshaller.impl.alt.MethodMarshallerUtils.createSystemException(MethodMarshallerUtils.java:1363) ~[org.apache.axis2.jar:na] at org.apache.axis2.jaxws.marshaller.impl.alt.

This explains why I am getting the full output rather than stopping at the first 'at'.
