Matching nulls in LINE_BREAKER removing first letter x?

mikel8
Explorer

Hopefully this is just a stupid regex error:

I'm using SplunkLightForwarder on AIX to send a few .sh_history logs to an indexer on Windows. Unfortunately, ksh uses nulls as delimiters between commands, and it sometimes throws in an extra null for no apparent reason. This makes the Splunk events look something like this:

Event 1

cd /etc
\x00\x00ls

Event 2

mkdir test
\x00cd test    

In other words, multiple events are incorrectly merged, and nulls are sprinkled throughout the logs. I spent a good deal of time trying to solve this (line merge/break settings, transforms, etc.). I ended up with the following in props.conf on my indexer:

[sourcetype]
LINE_BREAKER=(\\x00+)

This works beautifully, except when I exit the shell after testing this out, what shows up in Splunk?

eit

I can't figure out how in the world my regex is matching the x in exit. I later changed it to

LINE_BREAKER=((?:\\x00)+)

but it still eats the first 'x' in every event (axbxcx becomes abxcx). I've verified that there are no nulls adjacent to the x in the source.

Thanks in advance for your help!

Example data, zipped: http://www.mediafire.com/file/wwckoeo36v8p0v6/ksh-history-example.zip

$ tr "\000" "@" < ksh-history-example
mkdir -p test1/test2/test3
@cd test1
@ls
@cd test2
@ls
@cd test3
@ls
@cd ..
@@cd ..
@@ls
@cd ..
@@pwd
@@
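
For anyone without tr at hand (for example on the Windows indexer), the same inspection can be sketched in Python. This is only an illustration: the names show_nulls and null_runs are made up for this example, and the sample bytes are modeled on a few lines of the output above rather than read from the real file.

```python
import re

def show_nulls(data: bytes) -> str:
    # Show every NUL byte as '@', similar to the tr command above.
    return data.replace(b"\x00", b"@").decode("ascii", errors="replace")

def null_runs(data: bytes) -> list:
    # Length of each run of consecutive NUL bytes, in file order.
    return [len(m.group()) for m in re.finditer(rb"\x00+", data)]

# Hypothetical sample: each command ends in "\n" and is followed by
# one or two NUL bytes before the next command.
sample = b"cd test3\n\x00ls\n\x00cd ..\n\x00\x00cd ..\n\x00\x00ls\n"

print(show_nulls(sample))
print(null_runs(sample))
```

The run lengths make it easy to confirm that the separators vary between one and two nulls, which is why a pattern has to match a run of nulls rather than a single one.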

Ron_Naken
Splunk Employee

You can strip them using SEDCMD, instead of using LINE_BREAKER to break on the nulls:

[mysourcetype]
NO_BINARY_CHECK = true
SEDCMD-stripnull = s/\\x00//g

EDIT:

There's only limited room for comments. You can use this SEDCMD to replace with linebreaks:

[mysourcetype]
NO_BINARY_CHECK = true
SEDCMD-stripnull = s/\\x00/\n/g

Ron_Naken
Splunk Employee

Awesome, glad you were able to get it to work! Next time you need to use SEDCMD, keep in mind that you can use multiple sed's with a single SEDCMD. For instance:

[nulls]
NO_BINARY_CHECK = true
SEDCMD-stripnull = s/\\x00/\n/g s/\n{2,}/\n/g s/^[\n]*$//g

In addition to replacing nulls with \n's, this should strip any lines that contain all \n's, as well as convert any multiple \n's into singles. (Posted to illustrate 3 sed's in 1)
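
The chain can be approximated outside Splunk to see what each step does. The sketch below applies the three substitutions in order with Python's re module. It assumes Python's regex semantics are close enough to Splunk's for this input; in particular, without the MULTILINE flag the third pattern only matches input made up entirely of newlines. The sample bytes and the name apply_sed_chain are hypothetical.

```python
import re

# The three substitutions from the SEDCMD, in the order they are written.
# Patterns use a single backslash here, as Python byte-string regexes expect.
STEPS = [
    (rb"\x00", b"\n"),     # every NUL becomes a newline
    (rb"\n{2,}", b"\n"),   # collapse runs of newlines into one
    (rb"^[\n]*$", b""),    # blank out input that is nothing but newlines
]

def apply_sed_chain(data: bytes) -> bytes:
    # Each step sees the output of the previous one.
    for pattern, replacement in STEPS:
        data = re.sub(pattern, replacement, data)
    return data

sample = b"cd ..\n\x00\x00cd ..\n\x00\x00ls\n\x00"
print(apply_sed_chain(sample))
```

Because the steps run left to right, the second substitution cleans up the extra newlines that the first one introduces.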

Cheers

mikel8
Explorer

Posting here due to the limited comment space:

Thanks for all your help, Ron! Replacing with \n is ALMOST perfect. If that's all that's in the stanza, the events are still not split. All of the following setups DO split the events, but some events start with newlines, which throws a wrench into trying to match those events up later on. Any ideas on how to get rid of the newlines?

[mysourcetype]
NO_BINARY_CHECK=true
SEDCMD-stripnull=s/\\x00/\n/g
SHOULD_LINEMERGE=false

[mysourcetype]
NO_BINARY_CHECK=true
SEDCMD-stripnull=s/\\x00/\n/g
SHOULD_LINEMERGE=true
LINE_BREAKER=([\n]+)
BREAK_ONLY_BEFORE_DATE=false

[mysourcetype]
NO_BINARY_CHECK=true
SEDCMD-stripnewline=s/[\r\n]+//g
SEDCMD-stripnull=s/\\x00/\n/g
SHOULD_LINEMERGE=true
LINE_BREAKER=([\n]+)
BREAK_ONLY_BEFORE_DATE=false

Side note: After quite a bit of testing, I can say for certain that changing SEDCMD (and possibly other settings) in props.conf on the indexer shows up immediately in btool output, but it is not applied to forwarded input until Splunk is restarted! Frustrating.

mikel8
Explorer

Just had to change the sed to replace multiple nulls with one \n:

SEDCMD-stripnull = s/(?:\x00)+/\n/g

Thanks again, Ron!
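
To see why matching a run of nulls helps, here is a minimal Python sketch comparing the two substitutions. It only imitates the sed expressions with Python's re module and does not reproduce Splunk's event breaking; the sample bytes are hypothetical, modeled on the example data in the question.

```python
import re

# Hypothetical sample: each command ends in "\n" and ksh writes one or
# two NUL bytes before the next command.
raw = b"cd ..\n\x00\x00cd ..\n\x00\x00ls\n\x00cd ..\n"

# One newline per NUL, like s/\x00/\n/g.
per_null = re.sub(rb"\x00", b"\n", raw)

# One newline per run of NULs, like s/(?:\x00)+/\n/g.
per_run = re.sub(rb"(?:\x00)+", b"\n", raw)

print(per_null)
print(per_run)
```

With the per-null version the number of newlines between commands depends on how many nulls ksh happened to write; with the per-run version every separator has the same length.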

mikel8
Explorer

(Posted reply as an answer.)

mikel8
Explorer

It appears that | extract reload=true is insufficient; the SEDCMD works as expected after restarting Splunk. I'll see if I can use it to set up a custom linebreak as you suggested.

mikel8
Explorer

Whether they're merged seems to depend on the timing: when lines are added to the source rapidly (1/sec or so) they're merged, otherwise each entry is a separate event.

I now have:

[sourcetype]
NO_BINARY_CHECK = true
SEDCMD-stripnull = s/\\x00/ZzZ/g

I've verified with btool that this is being applied to my sourcetype, but after creating new events, there are no ZzZ strings showing up... Am I misunderstanding something?

Ron_Naken
Splunk Employee

Interesting. I'm using a Mac, and stripping the nulls allows it to break the lines properly. At least with SEDCMD, you could substitute a newline or custom linebreak for the nulls.

mikel8
Explorer

Thanks for the help! Your solution both removes the nulls and doesn't touch 'x's, but multiple commands are now merged into one event, even when adding SHOULD_LINEMERGE=false...

Ron_Naken
Splunk Employee

I appended some nulls and 'exit' and some more nulls to your sample data. The SEDCMD seems to do the job.
