Getting Data In

Regex for Line Break props.conf

nathanluke86
Communicator

Hello Splunkers,

Are there any regex geniuses who can help me line-break the logs below?
[Image: screenshot of the log sample]

Ideally, I want to remove the text highlighted in red and break lines at the points highlighted in yellow.

TIA


oscar84x
Contributor

Is it possible for you to provide an actual sample of the data? Delete or replace any user data.
It's difficult to figure out without knowing where there are blank spaces or carriage returns.

The settings you're looking to use in props are LINE_BREAKER and SEDCMD. Something like:

LINE_BREAKER = ([{}\,\s]+)"allowed" <-- this would start each event with "allowed" and discard the characters matched inside ()
SEDCMD-null = s/{|}|"netflows":\s+\[//g <-- this will get rid of the header line as well as any lingering single curly braces

You can experiment with the regex and those settings to find what works for your desired outcome. If you share a sample of the actual data structure, we can refine it.
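
To make the SEDCMD suggestion concrete, here is a minimal sketch that mimics its sed-style substitution with Python's re module. It is only an approximation: Splunk evaluates these expressions with PCRE, the header string is a made-up miniature of the payload, and apply_sedcmd and compiles are hypothetical helper names. The sketch escapes the "[" because an unescaped one opens a character class that is never closed.

```python
import re

# SEDCMD-null = s/{|}|"netflows":\s+\[//g written as a Python pattern.
# Assumption: Python's re stands in for the PCRE engine Splunk uses.
SED_PATTERN = r'{|}|"netflows":\s+\['


def apply_sedcmd(event):
    """Delete every match, like the trailing g flag of the sed expression."""
    return re.sub(SED_PATTERN, "", event)


def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Hypothetical miniature of the opening lines of the payload.
header = '{\n"netflows": [\n{\n"allowed_domain": [\n"example"\n],'
cleaned = apply_sedcmd(header)
print(repr(cleaned))

# The unescaped "[" leaves an unterminated character class.
print(compiles(r'{|}|"netflows":\s+['))
print(compiles(SED_PATTERN))
```

One thing to check against real data: with the g flag, the {|} alternatives delete every curly brace in an event, including the ones around nested objects.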


richgalloway
SplunkTrust

Try these props.conf settings.

[mysourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\s+{
SEDCMD-netflows = s/{\s+"netflows": \[//
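
A rough way to see what this stanza would do is to simulate it in Python. The sketch below rests on several assumptions: Python's re stands in for Splunk's PCRE engine, split_events is a hypothetical helper that mirrors how LINE_BREAKER discards the first capture group and breaks events around it, and the payload is invented. The payload is indented on purpose, because the \s+ in front of the { only matches when each record's opening brace is preceded by whitespace.

```python
import re

LINE_BREAKER = r'([\r\n]+)\s+{'       # from the stanza above
SED_PATTERN = r'{\s+"netflows": \['   # SEDCMD-netflows (no g flag: first match only)


def split_events(raw):
    """Approximate LINE_BREAKER: drop capture group 1 and break around it."""
    events, start = [], 0
    for match in re.finditer(LINE_BREAKER, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return events


# Hypothetical indented payload (values are placeholders).
raw = """{
  "netflows": [
    {
      "id": {
        "host_id": "h1"
      },
      "rx_bytes": 44
    },
    {
      "id": {
        "host_id": "h2"
      },
      "rx_bytes": 696
    }
  ]
}"""

events = [re.sub(SED_PATTERN, "", e, count=1) for e in split_events(raw)]
events = [e for e in events if e.strip()]  # assumption: ignore chunks left empty
for event in events:
    print(repr(event))

# Same payload with the indentation stripped: the pattern no longer matches.
flat = "\n".join(line.strip() for line in raw.splitlines())
flat_events = split_events(flat)
print(len(flat_events))
```

In this simulation the nested "id" objects stay inside their events because their braces do not start a line. The last event keeps the closing ] and } of the payload, and the unindented copy is not broken at all, so the result depends on how the real data is indented.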

P.S. Posting text instead of an image makes it easier for us to test regular expressions with your data.

---
If this reply helps you, Karma would be appreciated.

nathanluke86
Communicator

Thanks, I will try both suggestions. I have added the log sample as text.

nathanluke86
Communicator

{
"netflows": [
{
"allowed_domain": [
"xxxxxxxxxxxx"
],
"create_time": "2020-01-28T14:35:01.919766",
"direction": "DIRECTION_REMOTE_INITIATED",
"end_time": "2020-01-28T14:42:14.431033",
"endpoint_platform": "xxxxx",
"event_hostname": "xxxxxxx",
"id": {
"fragment_id": 7456039343514739067,
"host_id": "xxxxxxxxxxx",
"instance_id": "xxxxxxxxxxxxx",
"timestamp": "2020-01-28T15:29:50.785488"
},
"local_ip": "xxxxxxx",
"local_port": xxxxx,
"process_id": {
"host_id": "xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"pid": 1748,
"time_window": 13224155566
},
"protocol": "PROTOCOL_UDP",
"remote_ip": "xxxxxxxx",
"remote_port": xxxx,
"rx_bytes": 44,
"unique_timestamp": "2020-01-28T15:29:50.785488-f5ab3ba0c8c13db2"
},
{
"allowed_domain": [
"28fea4ba"
],
"create_time": "2020-01-28T14:34:57.648822",
"direction": "DIRECTION_REMOTE_INITIATED",
"end_time": "2020-01-28T14:42:11.299711",
"endpoint_platform": "xxxxxxxxx",
"event_hostname": "xxxx",
"id": {
"fragment_id": xxxxxxxxxxxxxxx,
"host_id": "xxxxxxxxxxxxxxxxxxxxxx",
"instance_id": "xxxxxxxxxxx",
"timestamp": "2020-01-28T15:29:50.785295"
},
"local_ip": "xxxxxxxxxx",
"local_port": xxxxxxx,
"process_id": {
"host_id": "xxxxxxxxxxxxxxxxxxxxxx",
"pid": xxxxx0,
"time_window": 13224155688
},
"protocol": "PROTOCOL_UDP",
"remote_ip": "xxxxxxxxxx",
"remote_port": xxxxxxxxxxxxxxx,
"rx_bytes": 696,
"unique_timestamp": "2020-01-28T15:29:50.x"
},
{
"allowed_domain": [
"x"
],
"create_time": "2020-01-28T14:34:59.348932",
"direction": "DIRECTION_REMOTE_INITIATED",
"end_time": "2020-01-28T14:42:10.980602",
"endpoint_platform": "x",
"event_hostname": "x",
"id": {
"fragment_id": x,
"host_id": "2xxxa1",
"instance_id": "dxxxx",
"timestamp": "2020-01-28T15:29:50.783380"
},
"local_ip": "fx3",
"local_port": x,
"process_id": {
"host_id": "x",
"pid": 1x8,
"time_window": 13224155566
},
"protocol": "PROTOCOL_UDP",
"remote_ip": "fxxxxxxx",
"remote_port": x,
"rx_bytes": 44,
"unique_timestamp": "2020-01-28T15:29:50.783380-997aae15b7991f4a"
},


nathanluke86
Communicator

TIA much appreciated


oscar84x
Contributor

Great. From what you shared I got 3 events starting with "allowed_domain" and ending with "unique_timestamp". Also got rid of the header. Try this:

[your_sourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=([{}\,\s]+)"allowed
NO_BINARY_CHECK=true
SEDCMD-null=s/^{\s+|"netflows": \[//g
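
As a sanity check, the stanza above can be approximated in Python against a cut-down copy of the posted sample. This is a sketch under assumptions, not Splunk's actual pipeline: Python's re stands in for PCRE, split_events is a hypothetical helper that discards the first capture group of LINE_BREAKER and breaks events around it, the sample values are placeholders, and the "[" in the SEDCMD pattern is written escaped because an unescaped one would open a character class.

```python
import re

LINE_BREAKER = r'([{}\,\s]+)"allowed'
SED_PATTERN = r'^{\s+|"netflows": \['   # SEDCMD-null with the "[" escaped


def split_events(raw):
    """Approximate LINE_BREAKER: drop capture group 1 and break around it."""
    events, start = [], 0
    for match in re.finditer(LINE_BREAKER, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return events


# Two shortened records in the unindented layout posted in the thread.
raw = """{
"netflows": [
{
"allowed_domain": [
"a.example"
],
"id": {
"host_id": "h1"
},
"rx_bytes": 44,
"unique_timestamp": "t1"
},
{
"allowed_domain": [
"b.example"
],
"id": {
"host_id": "h2"
},
"rx_bytes": 696,
"unique_timestamp": "t2"
},
"""

# Line breaking first, then the sed-style cleanup on each event (g flag: all matches).
events = [re.sub(SED_PATTERN, "", e) for e in split_events(raw)]
events = [e for e in events if e.strip()]  # assumption: the emptied header chunk is ignored
for event in events:
    print(repr(event))
```

In this simulation each event starts at "allowed_domain", the header chunk is reduced to an empty string, and the braces of the nested "id" object are kept. The final event still carries the trailing }, because no further "allowed key follows it.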


oscar84x
Contributor

Any luck?


nathanluke86
Communicator

Thanks @oscar84x

I am having issues with the Splunk Add-on Builder app making a REST API call. Once I solve that, I can test this properly and let you know.

Thanks for the help so far.


oscar84x
Contributor

No problem. If it works, please don't forget to accept it as an answer. Thank you.
