
Finding events that do not contain a specific string

teedilo
Path Finder

We have some line-breaking issues: our events often consist of multiple logical events merged together, or of fragments of logical records. We've tried a variety of fixes but no joy so far. Anyway, that's not really what my question is about. I'm trying to write a Splunk search that finds only "good" events as in "Scenario 1" below, where the event begins with the XML tag <record> and ends with </record>. No other <record> tag should appear inside the event; that would indicate an event like "Scenario 2", which contains multiple logical events merged together.

Scenario 1:

 

<record>
blah blah
blah
</record>

 

Scenario 2:

 

<record>blah blah
blah</record>
<record>
blah
blah blah</record>
<record>
blah
blah blah</record>

 

I learned that this can be accomplished outside of Splunk using a "negative lookbehind".  I tried this in a Splunk search like the one below.

| regex "(?s)^<record>((?!<record>)(\s|\S))*<\/record>$"

I couldn't get this to work, however. The same pattern does work as expected in regex101.

Any ideas?  I used the regex command instead of the rex command because I didn't need to extract anything.  Thanks in advance.


ITWhisperer
SplunkTrust

Technically, you are using a negative lookahead, not a lookbehind, but it is what you want. Which version of Splunk are you using? This works on 8.1.3 and 7.3.3:

 

| makeresults 
| eval events=split("<record>
blah blah
blah
</record>|<record>blah blah
blah</record>
<record>
blah
blah blah</record>
<record>
blah
blah blah</record>","|")
| mvexpand events
| rename events as _raw
| regex "(?s)^<record>((?!<record>)(\s|\S))*<\/record>$"
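
For anyone who wants to check the pattern outside Splunk (much like the regex101 test mentioned in the question), here is a minimal sketch in Python. It assumes Python's re module behaves like Splunk's PCRE engine for this particular pattern, which may not hold on every Splunk version; the helper name is_single_record is hypothetical and only there for illustration.

```python
import re

# Same pattern as in the search above. Python's re module is only a stand-in
# for Splunk's regex engine here (an assumption made for illustration).
PATTERN = re.compile(r"(?s)^<record>((?!<record>)(\s|\S))*<\/record>$")

# Scenario 1: a single, well-formed logical event.
scenario_1 = "<record>\nblah blah\nblah\n</record>"

# Scenario 2: three logical events merged into one event.
scenario_2 = (
    "<record>blah blah\nblah</record>\n"
    "<record>\nblah\nblah blah</record>\n"
    "<record>\nblah\nblah blah</record>"
)


def is_single_record(event: str) -> bool:
    """Return True when the event is exactly one <record>...</record> block."""
    return PATTERN.search(event) is not None


for name, event in [("scenario_1", scenario_1), ("scenario_2", scenario_2)]:
    print(name, is_single_record(event))
# prints:
# scenario_1 True
# scenario_2 False
```

The negative lookahead (?!<record>) is checked before every character consumed after the opening tag, so the match can never run past a second <record>. That is why the merged event in Scenario 2 is rejected while the single record in Scenario 1 passes.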

 

teedilo
Path Finder

Thanks for clarifying the terminology. I had copied the term from another post, so they must have misused it as well. In any case, good to know it works in newer versions of Splunk. I was afraid you'd ask which version we're on, because I'm embarrassed to say that we're still on 5.0.1. We've had no end of difficulties upgrading and dealing with other problems in Splunk. IMHO Splunk is a very difficult product to administer, and we just don't have the time it apparently takes to do a good job with it. Thanks for trying to help anyway.
