I have an XML file where, for some reason, some control characters were printed as ASCII strings, \x0a being a great example. I can use makemv delim="\x0a" and get a nice split on that literal sequence, but I've been trying to use rex with capture groups to split the record into named fields I can use in stats. Rex, however, doesn't seem to respect any way I try to represent the literal \x0a. I have tried \\x0a and just x0a, but it seems rex must be interpreting it as a newline, even though it's not represented that way in the raw record, because it won't extract anything into the fields I'm trying to make. If I use something like (?<field1>.*)\s+ I get some data in field1, but not the data I want, because I don't want to split on whitespace; I want to split on \x0a.
My _raw record looks something like
Data: somevalue\x0aData2 date-string\x0aData3 some-string with spaces\x0aData4
so the most consistent way to split it would be on the \x0a and not some other combination of things.
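Here's a lesson I learned a long time ago when using Splunk against embedded backslashes: when in doubt, just add more escaping backslashes. The pattern gets unescaped twice, once by the search parser and once by the regex engine, so a literal backslash needs more escaping than you'd expect.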
Witness the following vision of loveliness:
| makeresults | eval BigString="somevalue\x0aData2 date-string\x0aData3 some-string with spaces\x0aData4"
| rex field=BigString "\\\x0a(?<myCapture>[^\\\]*)\\\x0a"
The makeresults and eval part is just me making a run-anywhere example, creating a field called BigString to use later.
The rex with its capture group is the part that matters. It returns a field myCapture = "Data2 date-string". You can obviously extend the above to as many captures as you want; just separate them with the same escaped delimiter.
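As a rough sketch of that extension, the search below chains four capture groups using the same escaped tokens as the rex above. The field names field1 through field4 are made up for illustration, the sample string is the one from the question, and the pattern assumes the record always has exactly three \x0a delimiters, so treat it as a starting point rather than a tested extraction.

```
| makeresults
| eval BigString="Data: somevalue\x0aData2 date-string\x0aData3 some-string with spaces\x0aData4"
| rex field=BigString "^(?<field1>[^\\\]*)\\\x0a(?<field2>[^\\\]*)\\\x0a(?<field3>[^\\\]*)\\\x0a(?<field4>.*)$"
| table field1 field2 field3 field4
```

Each [^\\\]* group stops at the next backslash, so this only works if the values themselves never contain a backslash.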
Use rex in sed mode to replace the unwanted \x0a with a space or any character you want:
--SomeSearch--| eval data = "somevalue\x0aData2 date-string\x0aData3 some-string with spaces\x0aData4" | rex field=data mode=sed "s/(\\x0a)/ /g"| table data
You can also use a SEDCMD setting in props.conf to apply this automatically to your data at index time.
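A minimal sketch of such a props.conf entry follows. The stanza name your_sourcetype and the class name strip_x0a are placeholders, not values from the original data. Because props.conf is not passed through the search parser, the literal backslash should only need the one level of regex escaping here.

```
[your_sourcetype]
# Replace each literal \x0a (backslash, x, 0, a) in _raw with a space
SEDCMD-strip_x0a = s/\\x0a/ /g
```

This is applied when events are parsed on the indexer or heavy forwarder, so it only affects data indexed after the change, not events already in the index.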