Yes, Splunk can. You can use SEDCMD- to rewrite the events to remove the \x00s, which by the time the data hits an indexer are already the literal text "\x00" - they're no longer null bytes.
On the search bar:
| rex mode=sed "s/\\\\x00//g"
Automatically at parsing ("indexing") time for any new data, in props.conf:
[yoursourcetype]
SEDCMD-remove_nulls = s/\\x00//g
LINE_BREAKER = ((?:[\r\n](?:\\x00)?)+)
The special LINE_BREAKER was added because Splunk was interpreting the null bytes between \r and \n (the two halves of the Windows newline, in the file I was working on) as additional lines and adding them to the event. It says: use "(one carriage return or newline character, optionally followed by the literal text \x00), one or more times" as the breaker (which is thrown away) between events.
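To see what those two patterns do, here is a minimal Python sketch rather than Splunk itself. The sample string is made up: it assumes UTF-16LE text whose null bytes have already been turned into the literal text \x00, as described above. Python's re module is only standing in for Splunk's regex engine, and the split-then-substitute order is just for illustration.

```python
import re

# Hypothetical sample: two events separated by a Windows newline, where every
# null byte has already become the four literal characters \x00.
raw = "e\\x00v\\x001\\x00\r\\x00\n\\x00e\\x00v\\x002\\x00"

# Same pattern as the LINE_BREAKER above, minus the outer capture group
# (with a capture group, re.split would return the delimiters as well).
breaker = re.compile(r"(?:[\r\n](?:\\x00)?)+")

# Same pattern as the SEDCMD: a literal backslash followed by x00.
null_text = re.compile(r"\\x00")

# Split into events on the breaker, then strip the leftover \x00 text.
events = [null_text.sub("", chunk) for chunk in breaker.split(raw)]
print(events)
```

The breaker swallows the whole "\r\x00\n\x00" run as one delimiter, so no stray empty events appear between the two real ones.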
What is the /g part? What if I just wanted to delete the characters and/or just swap them with nothing?
xxx-xxx-xxxx is now xxxxxxxxxx
It is best to not post additional questions in the answer section. Post them as a question so they get proper visibility.
/g means globally - it will replace every instance of the subject that it finds, not just the first one.
s/-//g would swap every dash with nothing, but if you did s/-// with no g, you would end up with xxxxxx-xxxx.
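The same difference can be reproduced outside Splunk. This is a small Python sketch, not Splunk's sed mode: re.sub with its default count behaves like the /g form, and count=1 behaves like the form without g. The variable names are only illustrative.

```python
import re

phone = "xxx-xxx-xxxx"

# Like s/-//g: every dash is replaced with nothing.
all_removed = re.sub("-", "", phone)

# Like s/-//: only the first dash is replaced.
first_removed = re.sub("-", "", phone, count=1)

print(all_removed)    # xxxxxxxxxx
print(first_removed)  # xxxxxx-xxxx
```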
I tried the UTF-16LE as mentioned here but it did not work. But now that I think about it, I might have put the config on the indexer, not the universal forwarder. Oops. Config below still works when put on the indexer.
Have a look at SEDCMD - Admin Manual - Props.conf
Adding this to your props.conf should work:
SEDCMD-StripNULL= s/\x00//g
This sounds like a character encoding problem to me.
If the log is encoded as UTF-16, contains only ASCII-range characters, and is being read as UTF-8, then there'll be an extra \x00 between each character.
Find out what character encoding the messages use, then set the matching charset in Splunk (the CHARSET setting in props.conf).
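A quick way to convince yourself of this is the Python sketch below. It assumes the source is UTF-16LE, which is a guess based on the rest of this thread, and uses a made-up three-character string.

```python
text = "abc"

# UTF-16LE stores each ASCII-range character as its byte plus a null byte.
encoded = text.encode("utf-16-le")
print(encoded)  # b'a\x00b\x00c\x00'

# Read with the wrong charset, the null bytes survive as real characters.
misread = encoded.decode("utf-8")
print(repr(misread))  # 'a\x00b\x00c\x00'

# Read with the right charset, the original text comes back.
correct = encoded.decode("utf-16-le")
print(correct)  # abc
```

If the data looks like the misread case, fixing the charset where the file is first read is cleaner than stripping the nulls afterwards.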
this helped me out. thank you.