Solved: LINE_BREAKER in json files

SplunkDash · ‎12-28-2023

Hello,

The LINE_BREAKER in my props configuration for a JSON-formatted file is not working: it is not breaking the JSON events. My props and sample JSON events are given below. Any recommendation will be highly appreciated, thank you!

props

[myprops]

CHARSET=UTF-8

KV_MODE=json

LINE_BREAKER=([\r\n]+)\"auditId\"\:

SHOULD_LINEMERGE=true

TIME_PREFIX="audittime": "

TIME_FORMAT=%Y-%m-%dT%H:%M:%S

TRUNCATE=9999



Sample Events

{
"items": [
{
"auditId" : 15067,
"secId": "mtt01",
"audittime": "2016-07-31T12:24:37Z",
"links": [
{
"name":"conanicaldba",
"href": "https://it.for.dev.com/opa-api"
},
{
"name":"describedbydba",
"href": "https://it.for.dev.com/opa-api/meta-data"
}
]
},
{
"auditId" : 16007,
"secId": "mtt01",
"audittime": "2016-07-31T12:23:47Z",
"links": [
{
"name":"conanicaldba",
"href": "https://it.for.dev.com/opa-api"
},
{
"name":"describedbydba",
"href": "https://it.for.dev.com/opa-api/meta-data"
}
]
},

{
"auditId" : 15165,
"secId": "mtt01",
"audittime": "2016-07-31T12:22:51Z",
"links": [
{
"name":"conanicaldba",
"href": "https://it.for.dev.com/opa-api"
},
{
"name":"describedbydba",
"href": "https://it.for.dev.com/opa-api/meta-data"
}
]
}
]

marnall · ‎12-28-2023

I recommend using the website https://regex101.com/ to test your regex and ensure it is definitely matching. When your regex is inserted, it does not seem to match the space character between "auditId" and the following colon (:)

I would also recommend splitting the json events so that they have the curly brackets like so:

{

"event1keys" : "event1values",

....

}

{

"event2keys" : "event2values",

....

}

Thus your LINE_BREAKER value should also match the opening curly brace and its newline, and its first capture group should include the discardable characters between events, such as commas:

LINE_BREAKER=(,?[\r\n]+){\s*\"auditId\"

I also recommend setting SHOULD_LINEMERGE to false to prevent Splunk from re-assembling multi-line events after the split.
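
To check both points outside Splunk, here is a minimal Python sketch. It assumes Python's re module is close enough to Splunk's PCRE for these two patterns, and split_events is a hypothetical helper that only imitates the LINE_BREAKER rule (the text matched by the first capture group is discarded, and the next event starts right after it). The sample is a shortened copy of the posted events, and the opening brace is escaped in the sketch.

```python
import re

# Shortened copy of the posted sample (the "links" arrays are left out).
SAMPLE = (
    '{\n'
    '"items": [\n'
    '{\n'
    '"auditId" : 15067,\n'
    '"secId": "mtt01",\n'
    '"audittime": "2016-07-31T12:24:37Z"\n'
    '},\n'
    '{\n'
    '"auditId" : 16007,\n'
    '"secId": "mtt01",\n'
    '"audittime": "2016-07-31T12:23:47Z"\n'
    '}\n'
    ']\n'
)

ORIGINAL = r'([\r\n]+)\"auditId\"\:'          # pattern from the question
SUGGESTED = r'(,?[\r\n]+)\{\s*\"auditId\"'    # pattern from this reply, brace escaped


def split_events(text, pattern):
    """Hypothetical helper: drop capture group 1 and start the next event after it."""
    events, start = [], 0
    for match in re.finditer(pattern, text):
        events.append(text[start:match.start(1)])
        start = match.end(1)
    events.append(text[start:])
    return events


original_events = split_events(SAMPLE, ORIGINAL)
suggested_events = split_events(SAMPLE, SUGGESTED)

print(len(original_events), "event(s) with the original pattern")
print(len(suggested_events), "event(s) with the suggested pattern")
for event in suggested_events:
    print("-----")
    print(event)
```

On this shortened sample the original pattern never matches, because of the space before the colon, so everything stays in one event. The suggested pattern produces the header fragment plus one event per audit record.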


dtburrows3 · ‎12-28-2023

Testing this sample file on my local instance, I think something like this could work.

[ <SOURCETYPE NAME> ]
...
LINE_BREAKER=([\r\n]+)\s*\{\s*[\r\n]+\s*\"auditId\"
TIME_FORMAT=%Y-%m-%dT%H:%M:%S
TIME_PREFIX=(?:.*[\r\n]+)*\"audittime\":\s*\"
SEDCMD-remove_trailing_comma=s/\,$//g
SEDCMD-remove_trailing_bracket=s/\][\r\n]+$//g
TRANSFORMS-remove_header=remove_json_header

This is a parsed event from the sampled file.

I am getting a warning about the timestamp, but this is not because it is unable to find it but because the datetime exceeds my set limit for MAX_DAYS_AGO/MAX_DAYS_HENCE.
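
For anyone wondering why that warning appears, here is a rough Python sketch of the arithmetic. The 2000-day limit is an assumption (it is used here as a stand-in for MAX_DAYS_AGO, so check the value in your own props.conf), and the reference date is simply the date of this post. Python's strptime stands in for Splunk's timestamp parser, so treat this as an illustration only.

```python
import re
from datetime import date, datetime

TIME_PREFIX = r'(?:.*[\r\n]+)*\"audittime\":\s*\"'
TIME_FORMAT = '%Y-%m-%dT%H:%M:%S'

ASSUMED_MAX_DAYS_AGO = 2000          # assumption: stand-in for your MAX_DAYS_AGO
REFERENCE_DAY = date(2023, 12, 28)   # assumption: date of this post

event = (
    '{\n'
    '"auditId" : 15067,\n'
    '"secId": "mtt01",\n'
    '"audittime": "2016-07-31T12:24:37Z",\n'
)

# Find where the prefix ends, then parse only as many characters as the
# format describes (19 for this format), ignoring the trailing "Z".
prefix = re.search(TIME_PREFIX, event)
raw_timestamp = event[prefix.end():prefix.end() + 19]
parsed = datetime.strptime(raw_timestamp, TIME_FORMAT)

age_in_days = (REFERENCE_DAY - parsed.date()).days
too_old = age_in_days > ASSUMED_MAX_DAYS_AGO
print(raw_timestamp, age_in_days, too_old)
```

So the prefix and format do locate the timestamp. The event is just older than the assumed limit.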

Note the transform included in the props. It is needed to remove the first part of the JSON file that the events are nested in.
There will need to be an accompanying stanza in transforms.conf specifying the regex used to recognize the event to send to the null queue. It would probably look something like this.

[remove_json_header]
REGEX = ^\s*\{\s*[\r\n]+\"items\":\s*\[
DEST_KEY = queue
FORMAT = nullQueue
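
As a sanity check on the header regex and the two SEDCMD substitutions, here is a small Python sketch. It assumes Python's re module behaves like Splunk's PCRE for these patterns. The three event strings are hand-written approximations of what the LINE_BREAKER above would leave behind, and cleanup is a hypothetical helper. The sketch does not model the order in which Splunk applies transforms and SEDCMD.

```python
import json
import re

# Regex from the [remove_json_header] stanza.
HEADER_RE = re.compile(r'^\s*\{\s*[\r\n]+\"items\":\s*\[')


def cleanup(event):
    """Hypothetical helper: apply the two SEDCMD substitutions in Python."""
    event = re.sub(r'\,$', '', event)          # SEDCMD-remove_trailing_comma
    event = re.sub(r'\][\r\n]+$', '', event)   # SEDCMD-remove_trailing_bracket
    return event


# Approximate raw events after line breaking (shortened, assumed shapes).
header = '{\n"items": ['
middle = '{\n"auditId" : 15067,\n"secId": "mtt01"\n},'
last = '{\n"auditId" : 15165,\n"secId": "mtt01"\n}\n]\n'

# Events matching the header regex would go to nullQueue; the rest are cleaned.
kept = [cleanup(e) for e in (header, middle, last) if not HEADER_RE.search(e)]

for event in kept:
    print(json.loads(event))
```

If both remaining events load as valid JSON, KV_MODE=json should have clean input to work with.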
