I have an event field in the format of fieldTitle=Type: This is a description. Sometimes this event field contains an ampersand (&) in it, and when extracting the value of that field Splunk will stop and not pull the rest of the field. For example:
fieldTitle=Type: This & That
Splunk will display the value of fieldTitle as This.
In my regex I've tried escaping the ampersand, I've tried its hex and Unicode equivalent values, and I've even tried a .* which should match on everything regardless. None of these result in a match beyond the ampersand.
I've also tried the field extraction tool, and aside from it generating a very long and static regex that isn't as dynamic as I need, it also does not work when I call it in a search.
Has anyone had this same issue? I'm on Splunk 6.2.
This regular expression seems to have fixed it; however, it will not work if this field is at the end of the event. In that case I could probably add a \n match as well.
| rex field=_raw "(?:nitroBehavior=)(?<behavior>(.*?)(?=src))"
Thanks, everyone, for the help and for pointing me in the right direction.
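For anyone who wants to experiment outside Splunk, here is a minimal sketch of the end-of-event idea in Python. It assumes the next field always looks like whitespace followed by `key=`; the pattern and the `extract_behavior` helper are hypothetical and not something from this thread. Python's `re` module needs `(?P<name>...)` where rex accepts `(?<name>...)`.

```python
import re

# Sample event from this thread, shortened to the relevant fields.
EVENT = ("eventId=1234 nitroObjectId=Malware: Botnet "
         "nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS "
         "src=1.2.3.4 dst=5.6.7.8 nitroCat=Misc")

# Same value, but as the last field of the event.
EVENT_AT_END = ("eventId=1234 src=1.2.3.4 dst=5.6.7.8 "
                "nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS")

# Hypothetical pattern: capture lazily until the next " key=" token
# or until the end of the event, whichever comes first.
BEHAVIOR = re.compile(r"nitroBehavior=(?P<behavior>.*?)(?=\s+\w+=|\s*$)")


def extract_behavior(event):
    """Return the nitroBehavior value, or None if the field is absent."""
    match = BEHAVIOR.search(event)
    return match.group("behavior") if match else None


print(extract_behavior(EVENT))
print(extract_behavior(EVENT_AT_END))
```

Anchoring on "next key=" rather than on `src` specifically is a design choice for this sketch: it avoids depending on which field happens to follow nitroBehavior.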
It's a bit unclear what exactly you want in the new field name because of your second example, but if the src= field always follows the nitroBehavior= field, you can use this:
nitroBehavior=(?<nitro>.+)\ssrc
Basically, I think Splunk, when it automagically grabs the key-value pairs (which it will do when it sees an =), treats the ampersand as another delimiter and stops. So first, you want to re-assign the nitroBehavior field (I called the field nitro above, but you can call it nitroBehavior and it will take precedence over the auto-assigned one).
You can't use the field as is... since the text isn't surrounded by double quotes... and it's in a space delimited event (not nice 3rd party SIEM!) Splunk really just has to go with "best guess" and in this case, that's not good enough.
So grab the nitroBehavior field:
nitroBehavior=(?<nitroBehavior>.+)\ssrc
And then you can say
...|rex field=nitroBehavior "Botnet:\s(?<botnet>.+)"
Or if that subfield is a pattern, you can grab it in transforms with a dynamic field name
[nitroBehaviorInsides]
# SOURCE_KEY points at the new nitroBehavior field
SOURCE_KEY = nitroBehavior
REGEX = (\w+):\s(.+)
FORMAT = $1::$2
That'll grab both key and value pair for all the different messages.
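As a rough illustration of what a `$1::$2` style extraction does, here is a small Python sketch. It assumes the input is the already-isolated nitroBehavior value, with no trailing `src`; the `split_behavior` helper is hypothetical and only mimics the idea, not Splunk's actual transforms processing.

```python
import re

# Group 1 plays the role of $1 (the field name), group 2 of $2 (the value).
KEY_VALUE = re.compile(r"(\w+):\s(.+)")


def split_behavior(value):
    """Turn 'Name: rest of message' into {'Name': 'rest of message'}."""
    match = KEY_VALUE.match(value)
    return {match.group(1): match.group(2)} if match else {}


print(split_behavior("Botnet: GB Custom Signature C&C Traffic From DNS"))
print(split_behavior("Malware: Botnet"))
```

Each distinct prefix before the colon becomes its own field name, which is what "dynamic field name" refers to here.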
I don't believe that field always precedes a specific field. Going back through my event data, I've also seen it at the very end of the alert. I will try the above regex, and perhaps provide more examples of variance in the events.
With the above suggestion:
I'm doing |eval nitroBehavior=(?P<nitroBehavior>.+\ssrc and it's throwing an error saying "An unexpected character is reached at ?P<nitroBehavior>.+"
This regex |rex field=_raw "(?:nitroBehavior=)(?<behavior>.+[^\ssrc])"
captures the full value, but it does not stop at the next match of "src". It prints: "Botnet: GB Custom Signature C&C Traffic From DNS src", and the same happens if I just do |rex field=_raw "(?:nitroBehavior=)(?<behavior>.+\ssrc)"
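A possible explanation for both results, sketched in Python under the assumption that rex and Python's `re` treat these constructs the same way: `[^\ssrc]` is a negated character class that matches one character which is not whitespace, s, r, or c, so it does not mean "stop before src". And when `\ssrc` sits inside the named group, it is matched and captured. A lookahead outside the group is one way to stop before `src` without capturing it. The `capture` helper is hypothetical.

```python
import re

# Relevant tail of the sample event from this thread.
EVENT = ("nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS "
         "src=1.2.3.4 dst=5.6.7.8 nitroCat=Misc nitroDom=Domain")


def capture(pattern, text=EVENT):
    """Return the 'behavior' group of the first match, or None."""
    match = re.search(pattern, text)
    return match.group("behavior") if match else None


# Negated character class: the greedy .+ is free to run past "src".
with_class = capture(r"nitroBehavior=(?P<behavior>.+[^\ssrc])")

# " src" is inside the group, so it becomes part of the captured value.
inside_group = capture(r"nitroBehavior=(?P<behavior>.+\ssrc)")

# Lookahead outside the group: require " src=" next, but do not capture it.
with_lookahead = capture(r"nitroBehavior=(?P<behavior>.+?)(?=\ssrc=)")

print(repr(with_class))
print(repr(inside_group))
print(repr(with_lookahead))
```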
Excellent moniker IngloriousSplunker!
You need to show sample data and your regex. It's not Splunk stopping on the ampersand... it's your regex syntax and the event. The & isn't special in any way...
Can I get your sample event?
I think you can do something like this:
...|rex field=_raw "fieldTitle\=Type\: (?<fieldname>[^\n]+)"
depending on how your description ends. If that doesn't work, please share a sample event.
Thanks
I did not try field=_raw in the rex command; instead I designated field=nitroBehavior, which is the field I wanted to perform the regex on. I may try _raw tomorrow and just ignore everything up to the field I want, to see if that changes the result.
It's sensor event data from another SIEM. Below is a sample:
May 7 2015 15:36:21 forwarding-system-hostname.domain.com 2015-05-07T15:36:21.201Z|ESM|CEF|358|McAfee NTR Incident start= 1430987152 end= 1430987152 rt=1430990752 deviceExternalId=Sensor-A eventId=1234 nitroNormId=123588 nitroObjectId=Malware: Botnet nitroBehavior=Botnet: GB Custom Signature C&C Traffic From DNS src=1.2.3.4 dst=5.6.7.8 nitroCat=Misc nitroDom=Domain
The above is a representation of the event format I'm having an issue with. The field I'm having an issue with is "nitroBehavior". Splunk auto-parses the field; however, it extracts the value as "Botnet: Custom Signature C". I've tried numerous regular expressions, including | rex field=nitroBehavior "(?P<fieldname>.*)" and |rex field=nitroBehavior "(?P<fieldname>[^nitro])" and other variations that should work, including using the hex and Unicode representations of the ampersand. Every time, it captures "Botnet: Custom Signature C", but never goes beyond the ampersand.
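One reading that would be consistent with these symptoms, sketched in Python as an assumption rather than a confirmed diagnosis: if the auto-extracted field already ends at the ampersand, then any regex run against that field only sees the truncated text, while the same kind of regex run against the raw event can see the whole value. The `rex` helper and the way the truncated value is built are hypothetical.

```python
import re

FULL_VALUE = "Botnet: GB Custom Signature C&C Traffic From DNS"
RAW = "eventId=1234 nitroBehavior=" + FULL_VALUE + " src=1.2.3.4 dst=5.6.7.8"

# Assumption for illustration: the auto-extracted field stops at the "&".
TRUNCATED_FIELD = FULL_VALUE.split("&")[0]


def rex(text, pattern):
    """Return the 'fieldname' group of the first match, or None."""
    match = re.search(pattern, text)
    return match.group("fieldname") if match else None


# Even .* cannot return text that is not present in the input field.
from_field = rex(TRUNCATED_FIELD, r"(?P<fieldname>.*)")

# Against the raw event, the whole value is available to the regex.
from_raw = rex(RAW, r"nitroBehavior=(?P<fieldname>.+?)(?=\ssrc=)")

print(repr(from_field))
print(repr(from_raw))
```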