Splunk Search

Why Does Regex Not Match Ampersand?

IngloriousSplun
Communicator

I have an event field in the format of fieldTitle=Type: This is a description. Sometimes this event field contains an ampersand (&) in it, and when extracting the value of that field Splunk will stop and not pull the rest of the field. For example:

fieldTitle=Type: This & That

Splunk will display the value of fieldTitle as This.

In my regex I've tried escaping the ampersand, I've tried its hex and unicode equivalent values, and I've even tried a .* which should match on everything regardless. None of these result in a match beyond the ampersand.

I've also tried the field extraction tool, and aside from it generating a very long and static regex that isn't as dynamic as I need, it also does not work when I call it in a search.

Has anyone had this same issue? I'm on Splunk 6.2.

IngloriousSplun
Communicator

This regular expression seems to have fixed it; however, it will not work if this field is at the end of the event. In that case I could probably add a \n match as well.

| rex field=_raw "(?:nitroBehavior=)(?<behavior>(.*?)(?=src))"
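
A possible way to cover the end-of-event case without a separate \n match is to let the lookahead accept either the next key=value pair or the end of the event. The sketch below checks that idea in Python rather than in Splunk, so the named group uses Python's (?P<name>...) syntax. The two events are trimmed, hypothetical versions of the sample in this thread, and the lookahead (?=\s\w+=|$) rests on the assumption that every key is a single word followed by =.

```python
import re

# Assumption: every key is a single word followed by "=", so the value ends
# right before the next " key=" or at the end of the event.
PATTERN = re.compile(r"nitroBehavior=(?P<behavior>.+?)(?=\s\w+=|$)")

def extract_behavior(event):
    """Return the nitroBehavior value, or None if the key is absent."""
    match = PATTERN.search(event)
    return match.group("behavior") if match else None

# Trimmed, hypothetical events based on the sample in this thread.
mid_event = "eventId=1234 nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS src=1.2.3.4 dst=5.6.7.8"
end_event = "eventId=1234 src=1.2.3.4 nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS"

print(extract_behavior(mid_event))
print(extract_behavior(end_event))
```

Both calls should return the full value including the ampersand. A value that itself contains something like " word=" would be cut short, so this is only a starting point. The equivalent rex would presumably use the same pattern with (?<behavior>...) as the group name, but that has not been tested here.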

Thanks, everyone, for the help and for pointing me in the right direction.

rsennett_splunk
Splunk Employee

It's a bit confusing as to what you want exactly in the new field name because of your second example... but if the src= field always follows the nitroBehavior= field you can use this:

nitroBehavior=(?<nitro>.+)\ssrc

Basically I think Splunk, when it automagically grabs the key value pairs (which it will do when it sees an =), sees the ampersand as another delimiter and stops... so first, you want to re-assign the nitroBehavior field (I called the field nitro above, but you can call it nitroBehavior and it will take precedence over the auto-assigned one).

You can't use the field as is... since the text isn't surrounded by double quotes... and it's in a space delimited event (not nice 3rd party SIEM!) Splunk really just has to go with "best guess" and in this case, that's not good enough.
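
To see why no pattern run against the auto-extracted field can get past the ampersand, here is a small sketch in Python (not Splunk). It assumes the auto-extracted value is the truncated string reported in this thread, and the variable names are made up for illustration. A regex only sees the text of the field it is pointed at, so even .* cannot return characters that were never stored in that field, while a pattern run against the raw event text can.

```python
import re

# Assumption: the value Splunk auto-extracted, as reported in this thread.
auto_extracted = "Botnet: Custom Signature C"
# Trimmed, hypothetical copy of the raw event text.
raw_event = "nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS src=1.2.3.4"

# ".*" against the field can only return what the field already holds.
from_field = re.search(r"(?P<fieldname>.*)", auto_extracted).group("fieldname")

# Against the raw text, the full value up to " src" is available.
from_raw = re.search(r"nitroBehavior=(?P<nitro>.+)\ssrc", raw_event).group("nitro")

print(from_field)
print(from_raw)
```

The first result stays truncated no matter how the pattern is written; the second contains the ampersand and everything after it.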

So grab the nitroBehavior field:
nitroBehavior=(?<nitroBehavior>.+)\ssrc
And then you can say
...|rex field=nitroBehavior "Botnet:\s(?<botnet>.+)"

Or if that subfield is a pattern, you can grab it in transforms with a dynamic field name

[nitroBehaviorInsides]
# SOURCE_KEY is the new nitroBehavior field
SOURCE_KEY = nitroBehavior
REGEX = (\w+):\s(.+)
FORMAT = $1::$2

That'll grab both key and value pair for all the different messages.
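
As a rough illustration of what $1::$2 does, the sketch below mimics it in Python (not Splunk): the first capture group becomes the field name and the second becomes its value. The helper name and the input string are hypothetical. The pattern drops the trailing \ssrc because the input is assumed to be the already re-extracted nitroBehavior value, which ends before src.

```python
import re

def dynamic_field(value):
    """Mimic $1::$2: group 1 is the field name, group 2 is its value."""
    match = re.search(r"(\w+):\s(.+)", value)
    return {match.group(1): match.group(2)} if match else {}

# Hypothetical input: the re-extracted nitroBehavior value from this thread.
print(dynamic_field("Botnet: GB Custom Signature C&C Traffic From DNS"))
```

With that input the result is a single field named Botnet whose value is the rest of the message; a value without a "name: " prefix produces no field at all.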

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

IngloriousSplun
Communicator

I don't believe that field always precedes a specific field. Going back through my event data, I've also seen it at the very end of the alert. I will try the above regex, and perhaps provide more examples of variance in the events.


IngloriousSplun
Communicator

With the above suggestion:

I'm doing |eval nitroBehavior=(?P<nitroBehavior>.+\ssrc and it's throwing an error saying "An unexpected character is reached at ?P<nitroBehavior>.+"


IngloriousSplun
Communicator

This regex |rex field=_raw "(?:nitroBehavior=)(?<behavior>.+[^\ssrc])" captures the full value, but it does not stop at the next match of "src". It prints: "Botnet: GB Custom Signature C&C Traffic From DNS src", and the same happens if I just do |rex field=_raw "(?:nitroBehavior=)(?<behavior>.+\ssrc)"


rsennett_splunk
Splunk Employee

Excellent moniker, IngloriousSplunker!
You need to show sample data and your regex. It's not Splunk stopping on the ampersand... it's your regex syntax and the event. The & isn't special in any way...


stephanefotso
Motivator

Can I get your sample event?
I think you can do something like this:

...|rex field=_raw "fieldTitle\=Type\: (?<fieldname>[^\n]+)"

depending on how your description ends. If that doesn't work, please send me your sample event.
Thanks

SGF

IngloriousSplun
Communicator

I did not try running rex against _raw; instead I designated field=nitroBehavior, which is the field I wanted to perform the regex on. I may try _raw tomorrow and just ignore everything up to the field I want, to see if that changes the result.


IngloriousSplun
Communicator

It's sensor event data from another SIEM. Below is a sample:

May  7 2015 15:36:21 forwarding-system-hostname.domain.com 2015-05-07T15:36:21.201Z|ESM|CEF|358|McAfee NTR Incident start= 1430987152 end= 1430987152 rt=1430990752 deviceExternalId=Sensor-A eventId=1234 nitroNormId=123588 nitroObjectId=Malware: Botnet nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS src=1.2.3.4 dst=5.6.7.8 nitroCat=Misc nitroDom=Domain

The above is a representation of the event format I'm having an issue with. The field in question is "nitroBehavior". Splunk auto-parses the field; however, it extracts the value as "Botnet: Custom Signature C". I've tried numerous regular expressions, including | rex field=nitroBehavior "(?P<fieldname>.*)" and |rex field=nitroBehavior "(?P<fieldname>[^nitro])" and other variations that should work, as well as the hex and unicode representations of the ampersand. Every time, it captures "Botnet: Custom Signature C", but never goes beyond the ampersand.
