Help with props.conf, LINE_BREAKER

Megamuch
Engager

While testing Splunk, I wanted to see whether I could easily create a custom input using ncat and Splunk's UDP input.

The input works; now I have to tell Splunk how to split the input stream.

The input is a multiline string that contains either XML or pipe-delimited (|) data, and it is always terminated by the two characters ~\ (tilde, backslash).

So I created a new props.conf in %SPLUNK_HOME%/etc/system/local/ and added the following:

[source::c:\\splunkinput\\my.log]
LINE_BREAKER = ^~\$

Unfortunately nothing happens, and I have not yet figured out how to check what exactly is going on when a new file is imported into Splunk.

The end result should be that every sequence between ~\ terminators (including any carriage returns etc.) is treated as a separate event.

Any tips?

P.S. Is there a way to activate the props.conf changes without restarting splunkd?

Lowell
Super Champion

I think you simply want

[mysourcetype]
LINE_BREAKER = (~\\)
# You may need to increase this (default 100)
LINE_BREAKER_LOOKBEHIND = 1000
SHOULD_LINEMERGE = false

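There are two things to consider here: 1) Splunk requires a capturing group in LINE_BREAKER; the text matched by the first group marks the event boundary and is discarded. 2) A literal backslash has to be escaped as \\ in the regex. In your original pattern, \$ is read as a literal dollar sign rather than a backslash followed by an end-of-line anchor.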

I just re-read the question, and it sounds like you may also want newlines to split events. If that's correct, then try the following:

LINE_BREAKER = (~\\|[\r\n]+)
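
To see how the two LINE_BREAKER suggestions differ, here is a minimal Python sketch. It assumes the behavior described above (the stream is broken wherever the regex matches, and the text matched by the first capturing group is discarded). The `split_events` helper and the `sample` string are hypothetical, and Python's `re` module is only a stand-in for Splunk's regex engine.

```python
import re

def split_events(line_breaker, stream):
    """Approximate LINE_BREAKER: break at each match, drop group 1's text."""
    pattern = re.compile(line_breaker)
    events, start = [], 0
    for match in pattern.finditer(stream):
        # Text before the first capturing group closes the current event.
        events.append(stream[start:match.start(1)])
        # Text after the group opens the next event.
        start = match.end(1)
    events.append(stream[start:])
    # Drop empty leftovers such as the one after a trailing delimiter.
    return [event for event in events if event]

# Hypothetical two-record stream: the first record spans two lines.
sample = "A:1|2\n3|4~\\B:hello~\\"

only_delimiter = split_events(r"(~\\)", sample)
delimiter_or_newline = split_events(r"(~\\|[\r\n]+)", sample)

print(only_delimiter)        # two events, the first keeps its newline
print(delimiter_or_newline)  # three events, the newline also breaks
```

If multiline records should stay intact, the first pattern is the one that produces that result in this model; the second also breaks at every newline.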

Megamuch
Engager

I have tried the following settings without success:

LINE_BREAKER = ~\\

LINE_BREAKER = ~\\^

LINE_BREAKER = ([~\\]+)

LINE_BREAKER = (.*)[~\\](.*)

LINE_BREAKER = .*~\\.*
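
A quick way to compare these attempts is to count their capturing groups and look at what they actually match. The sketch below uses Python's `re` module as a stand-in for Splunk's regex engine, which is an assumption; the variable names are illustrative only.

```python
import re

# The patterns tried above, written as Python raw strings.
attempts = [r"~\\", r"~\\^", r"([~\\]+)", r"(.*)[~\\](.*)", r".*~\\.*"]

# LINE_BREAKER needs at least one capturing group to mark the break.
group_counts = {p: re.compile(p).groups for p in attempts}
for pattern, count in group_counts.items():
    print(f"{pattern!r}: {count} capturing group(s)")

# [~\\] is a character class: it matches a lone "~" OR a lone "\",
# not the two-character sequence "~\".
class_matches_lone_tilde = re.search(r"([~\\]+)", "a~b") is not None
sequence_matches_lone_tilde = re.search(r"(~\\)", "a~b") is not None
print(class_matches_lone_tilde, sequence_matches_lone_tilde)
```

Three of the five attempts have no capturing group at all. In (.*)[~\\](.*), the first group is (.*), so if the first group's text is what gets discarded, it would swallow event text rather than the delimiter.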

An example string would be:

SMSEUCP_7110:STATUS:1049110|7116|7110|192.168.0.5
1180178|7112|7110|192.168.0.5
14156304|7111|7110|192.168.0.5
1180174|7117|7110|192.168.0.5
1180170|7119|7110|192.168.0.5
5767676|7113|7110|192.168.0.5
5308816|7114|7110|192.168.0.5
1573452|7115|7110|192.168.0.5
2426006|7118|7110|192.168.0.5
11141326|7110|7110|192.168.0.5~\SMSEMO_0000:S:(0000) Incoming : 3161234567 oh really? let do that then, ok?~\SMSEMO_0000:P:Posting : http://someurlwithparameters~\

The end result should be multiline events split by ~\ like so:

Event 1:

SMSEUCP_7110:STATUS:1049110|7116|7110|192.168.0.5
1180178|7112|7110|192.168.0.5
14156304|7111|7110|192.168.0.5
1180174|7117|7110|192.168.0.5
1180170|7119|7110|192.168.0.5
5767676|7113|7110|192.168.0.5
5308816|7114|7110|192.168.0.5
1573452|7115|7110|192.168.0.5
2426006|7118|7110|192.168.0.5
11141326|7110|7110|192.168.0.5

Event 2:

SMSEMO_0000:S:(0000) Incoming : 3161234567 oh really? let do that then, ok?

Event 3:

SMSEMO_0000:P:Posting : http://someurlwithparameters

I'm no regexp guru, but I thought this would be easier 😉

ftk
Motivator

I've updated my answer based on the sample data. If that doesn't work, try experimenting with the other line-breaking settings in props.conf: http://www.splunk.com/base/Documentation/latest/Admin/Propsconf

ftk
Motivator

In your regex you need to escape the backslash, like this:

LINE_BREAKER = ^~\\$

If ~\ is not on a line by itself, drop the leading caret from your LINE_BREAKER definition:

LINE_BREAKER = ~\\$
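
The difference between \$ and \\$, and the effect of the $ anchor on delimiters that sit mid-line, can be checked with a small Python sketch. Python's `re` is used here as a stand-in for Splunk's regex engine (an assumption; how Splunk applies anchors to its input chunks may differ), and `sample` is a shortened, hypothetical stream.

```python
import re

# Shortened, hypothetical stream: two delimiters mid-line, one at the very end.
sample = "A:1|2\n3|4~\\B:hello~\\C:bye~\\"

# \$ is an escaped dollar sign, so ^~\$ looks for the literal text "~$".
matches_literal_dollar = re.search(r"^~\$", "~$") is not None
matches_delimiter = re.search(r"^~\$", "~\\") is not None

# With the backslash escaped, ~\\$ only matches a delimiter at the end.
anchored = [m.start() for m in re.finditer(r"~\\$", sample)]
unanchored = [m.start() for m in re.finditer(r"~\\", sample)]

print(matches_literal_dollar, matches_delimiter)
print(len(anchored), len(unanchored))
```

In this model the anchored pattern finds only the final delimiter, while the unanchored one finds all three.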

I believe that event-parsing settings (such as LINE_BREAKER) require a restart of splunkd, whereas search-time settings in props.conf (field extractions, for example) are applied automatically without a restart.

[EDIT Based on more info provided]

Based on the sample data, give the following a try in your props.conf:

[source::c:\\splunkinput\\my.log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = false
MUST_BREAK_AFTER = ~\\
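
One thing worth checking with this approach is where the delimiter sits relative to line boundaries. Assuming that, with SHOULD_LINEMERGE = true, the stream is first broken into lines and the lines are then merged into events, with MUST_BREAK_AFTER starting a new event after any line that matches, the merging step can be modeled roughly as below. The `merge_lines` helper and `sample` are hypothetical, and timestamps and all other merge rules are ignored.

```python
import re

def merge_lines(lines, must_break_after):
    """Rough model: merge lines, closing an event after a matching line."""
    pattern = re.compile(must_break_after)
    events, current = [], []
    for line in lines:
        current.append(line)
        if pattern.search(line):
            # A matching line ends the current event.
            events.append("\n".join(current))
            current = []
    if current:
        events.append("\n".join(current))
    return events

# Hypothetical stream: delimiters appear mid-line, as in the sample data.
sample = "A:1|2\n3|4~\\B:hello~\\C:bye~\\"

events = merge_lines(sample.split("\n"), r"~\\")
print(len(events))
```

In this model all three records end up in a single event, because the delimiters share one line, and the delimiter text stays in the event. If real behavior matches the model, a LINE_BREAKER with a capturing group is the better fit for delimiters that are not followed by a newline.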

Lowell
Super Champion

Hmm, can you use ^ in LINE_BREAKER? I would think that you'd always need to use something like [\r\n]+ instead of ^ or $. Just my 2 cents. And after re-reading all this info, I don't think you want to use end-of-string ($), start-of-string (^), or traditional end-of-line ([\r\n]) matching at all, since the ~\ delimiter can appear in the middle of a line.
