Splunk Search

Search text with regex multi line mode?

New Member

Hi,

I'm trying to search for some keywords that appear on multiple lines. I tried using a regular expression in multiline mode (?m), but it does not work.

In the search box, I put

host=dev* | regex _raw="(?m)*POST*Can't read the image!*"

I got the following error: Error in 'SearchOperator:regex': Invalid regex '(?m)Can't read the image!': nothing to repeat

I'm on Splunk 4.0.8.

Any input would be appreciated. Thank you.


Legend

Would this work for you? The transaction command will group events with the same IP address, where the first event has POST and the last has "Can't read the image". I arbitrarily specified that the two events should occur within 10 minutes of each other.

This solution requires that Splunk recognizes the IP address in your events. I am assuming that the name of the IP address field is ip_addr.

host=dev* (post OR "Can't read the image!") | transaction ip_addr startswith=post endswith=image maxspan=10m

BTW, what is the sourcetype of these events? If I knew the sourcetype, I might be able to make clearer suggestions. Thanks!
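
To make the grouping logic concrete, here is a rough Python sketch of what the transaction search above asks for: pair a POST event with a later "image" event from the same IP address, provided they fall within the maximum span. This is only an illustration of the idea, not how Splunk implements transaction. The function name pair_events, the tuple layout, and the sample IP addresses are all hypothetical (the real addresses are masked in the thread).

```python
from datetime import datetime, timedelta


def pair_events(events, maxspan=timedelta(minutes=10)):
    """Pair each POST event with the next "image" event from the same IP.

    events: list of (timestamp, ip_addr, message) tuples, sorted by time.
    Returns a list of (start_event, end_event) pairs.
    """
    open_by_ip = {}  # ip_addr -> POST event still waiting for its error line
    transactions = []
    for ts, ip, msg in events:
        text = msg.lower()
        if "post" in text:
            # Roughly startswith=post: remember the opening event for this IP.
            open_by_ip[ip] = (ts, ip, msg)
        elif "image" in text and ip in open_by_ip:
            # Roughly endswith=image: close the pair if it is within maxspan.
            start = open_by_ip.pop(ip)
            if ts - start[0] <= maxspan:
                transactions.append((start, (ts, ip, msg)))
    return transactions


# Timestamps come from the thread; the IP addresses are made up.
fmt = "%Y-%m-%d %H:%M:%S,%f"
sample = [
    (datetime.strptime("2010-08-10 18:18:17,243", fmt), "10.0.0.1", "INFO: POST /some_url"),
    (datetime.strptime("2010-08-10 18:18:17,246", fmt), "10.0.0.1", "DEBUG: Can't read the image!"),
    (datetime.strptime("2010-08-10 18:18:18,000", fmt), "10.0.0.2", "DEBUG: Can't read the image!"),
]
print(len(pair_events(sample)))
```

The third sample event has no preceding POST from the same address, so it is left out, much as the search groups by ip_addr.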

New Member

This works. The sourcetype is custom for our application. I have added sourcetype to the query to narrow down the search results more. Thanks a lot!


Splunk Employee

minalenan: Additionally, reading your data below, it appears that you are consuming your data in a multiple 'event' fashion, not a multiple 'line' fashion.

2010-08-10 18:18:17,243 [http-8080-20 ][xxx.xxx.xxx.xxx]: INFO: POST /some_url
2010-08-10 18:18:17,246 [http-8080-20 ][xxx.xxx.xxx.xxx]: DEBUG: Can't read the image!

As you point out, Splunk is interpreting this as two separate events, and I believe you won't be able to pull them together in this fashion (if that's what you're trying to do).

Moreover, if you want to do a simple search of these events, you might want to create a simple search that looks for both nuggets. 🙂

Something like this:

host=dev* | search POST OR "Can't read the image!"
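
As a small illustration of the event-versus-line point, the Python sketch below models the two log lines as two separate events and tests each one on its own, which is roughly what a per-event regex filter does. Python's re module is only a stand-in here; Splunk's regex engine is PCRE-based, and the variable names are made up for the example.

```python
import re

# The two log lines from the thread, modeled as two separate events.
events = [
    "2010-08-10 18:18:17,243 [http-8080-20 ][xxx.xxx.xxx.xxx]: INFO: POST /some_url",
    "2010-08-10 18:18:17,246 [http-8080-20 ][xxx.xxx.xxx.xxx]: DEBUG: Can't read the image!",
]

both = re.compile(r"POST.*Can't read the image!")
either = re.compile(r"POST|Can't read the image!")

# A pattern that needs both keywords is tested against one event at a time,
# so neither event can satisfy it.
matches_both = [e for e in events if both.search(e)]

# Looking for either keyword returns both events, like the OR search above.
matches_either = [e for e in events if either.search(e)]

print(len(matches_both), len(matches_either))
```

No regex flag changes this, because the pattern never sees the two events as one string.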


New Member

Yes, you are right. Splunk does interpret it as 2 separate events. Thanks.


Legend

It does appear that the (?m) syntax should be supported by Splunk. But I am unclear why you need it in this search. If you are searching for "something" followed by "POST" followed by "something" followed by "Can't read the image!" then I think you could use

host=dev* | regex _raw=".*POST.*Can't read the image!.*"

If you want the exact string *POST*Can't read the image!* then you can search for

host=dev* | regex _raw="\*POST\*Can't read the image!\*"
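
For what it's worth, the "nothing to repeat" error in the question comes from the bare * right after (?m): in a regex, * is a quantifier rather than a wildcard, so it needs something in front of it to repeat. The sketch below reproduces this with Python's re module, used only as a stand-in for Splunk's PCRE-based engine (exact messages may differ). The helper compile_error and the sample line, which joins both keywords on one line, are hypothetical.

```python
import re

bad = r"(?m)*POST*Can't read the image!*"
good = r".*POST.*Can't read the image!.*"


def compile_error(pattern):
    """Return the regex compile error message, or None if the pattern is valid."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


# Hypothetical single-line event that contains both keywords.
line = "INFO: POST /some_url DEBUG: Can't read the image!"

print(compile_error(bad))            # an error message about the stray "*"
print(compile_error(good))           # None: ".*" is a valid "anything" wildcard
print(bool(re.search(good, line)))
```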

New Member

Thanks for the answer.

The word "POST" appears on a different line from "Can't read the image!" in the log files that Splunk indexed.

2010-08-10 18:18:17,243 [http-8080-20 ][xxx.xxx.xxx.xxx]: INFO: POST /some_url
2010-08-10 18:18:17,246 [http-8080-20 ][xxx.xxx.xxx.xxx]: DEBUG: Can't read the image!


Splunk Employee

This is working for me with version 4.1.4.

sourcetype=apilog | regex _raw="(?m)callerAction*"

Data Example:

#### 2010-08-10 18:52:45,177
     nameSpace:     content.static.API
     subscriber:    6129045580
     callerID:      TTCOV105440648-1368613
     driver:        content.jdbc.ContentDriver
     callerAction:  MAR10446LA
     host:              10.25.50.109
     connectionResult:  SUCCESS
     Details:       Successfully updated contentDB 

I would suggest an upgrade first.

EDIT: Another thing it might be choking on is the single quote you have in there. Try escaping it: Can\'t


Splunk Employee

You DO NOT, in fact, need the (?m) for the regex to work.
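
A quick way to see why: (?m) only changes where ^ and $ are allowed to match, so it has no effect on a pattern without anchors. The Python sketch below uses a shortened, hypothetical copy of the multi-line data example; Python's re module stands in for Splunk's PCRE-based engine, where these two flags are expected to behave the same way.

```python
import re

# A shortened, hypothetical multi-line event modeled on the data example.
event = (
    "#### 2010-08-10 18:52:45,177\n"
    "     callerID:      TTCOV105440648-1368613\n"
    "     callerAction:  MAR10446LA\n"
)

# No anchors in the pattern, so (?m) makes no difference.
plain = bool(re.search(r"callerAction", event))
with_m = bool(re.search(r"(?m)callerAction", event))

# (?m) matters only when ^ or $ should match at each line boundary.
anchored = bool(re.search(r"^\s+callerAction", event))
anchored_m = bool(re.search(r"(?m)^\s+callerAction", event))

# "." does not cross newlines unless (?s) is set.
across = bool(re.search(r"callerID.*callerAction", event))
across_s = bool(re.search(r"(?s)callerID.*callerAction", event))

print(plain, with_m, anchored, anchored_m, across, across_s)
```

So for keywords on different lines of the same event, (?s) is the flag that lets .* span the line break. Neither flag helps when the lines are indexed as separate events.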


New Member

Thanks, Lamar. Unfortunately, I have no control over that. So, upgrading is not an option.


Legend

Lamar, do you really need the (?m) in your regex? I think it might work just as well without it.
