I have input files from MS Graph containing pretty-printed JSON that looks something like the following (ellipses used liberally). I cannot find a LINE_BREAKER, BREAK_ONLY_BEFORE, or MUST_BREAK_AFTER value that splits the records on the comma between the }, and the {. Note that the indentation has been stripped from this sample.
[
{
"@odata.type": "#Microsoft.graph....",
"id": "...",
"...": "...",
"foobar": {
"foo1": "bar1",
"foo2": "bar2",
},
"...": "...",
"barfoo": {
"bar1": "foo1",
"bar2": "foo2",
}
},
{
"@odata.type": "#Microsoft.graph....",
"id": "...",
"...": "...",
},
{
"...": "...",
"...": "...",
}
]
This props.conf fails because other }, strings occur within each record (see the end of "foobar"):
[json]
TRUNCATE = 0
KV_MODE = json
TIME_PREFIX = \"xxxEventDateTime\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6N
MUST_BREAK_AFTER = \s*}\,
This is for a 6.5.x Splunk heavy forwarder feeding a 6.5.x indexer cluster.
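To see why the MUST_BREAK_AFTER attempt over-splits, here is a rough Python sketch that applies the same pattern to a cut-down copy of the sample, one line at a time. It only approximates line-merge behaviour and is not Splunk's actual implementation; the names `sample_lines` and `break_after` are made up for illustration.

```python
import re

# Pattern copied from the MUST_BREAK_AFTER setting above.
# Assumption: the pattern is tested against each input line separately.
MUST_BREAK_AFTER = re.compile(r"\s*}\,")

# Cut-down version of the sample: three records, the first with nested objects.
sample_lines = [
    "[",
    "{",
    '"id": "1",',
    '"foobar": {',
    '"foo1": "bar1"',
    "},",            # end of nested "foobar" object, NOT a record boundary
    '"barfoo": {',
    '"bar1": "foo1"',
    "}",
    "},",            # end of record 1
    "{",
    '"id": "2"',
    "},",            # end of record 2
    "{",
    '"id": "3"',
    "}",
    "]",
]

# 1-based numbers of the lines after which a break would be forced.
break_after = [
    number
    for number, line in enumerate(sample_lines, start=1)
    if MUST_BREAK_AFTER.search(line)
]

print(break_after)
```

The pattern fires on three lines although there are only two record boundaries; the extra hit is the }, that closes "foobar", which cuts the first record in half.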
I found the following to work:
TRUNCATE = 0
SHOULD_LINEMERGE = false
PREAMBLE_REGEX = ^\s*\[\s*$
LINE_BREAKER = }(,\s*[\r\n]*\s*){
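As a rough illustration of why this LINE_BREAKER works, the Python sketch below mimics the rule that the text matched by the first capture group is discarded, with the previous event ending just before it and the next event starting just after it. It is a simulation under that assumption, not Splunk code; `split_events` and `sample` are hypothetical names, the braces are escaped only for readability in Python, and PREAMBLE_REGEX is not modelled, so the outer [ and ] stay attached to the first and last events here.

```python
import re

# Same pattern as the LINE_BREAKER above, with the literal braces escaped.
LINE_BREAKER = re.compile(r"\}(,\s*[\r\n]*\s*)\{")


def split_events(raw: str) -> list:
    """Split raw text into events, dropping whatever capture group 1 matched."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        # The closing brace before the group stays with the current event.
        events.append(raw[start:match.start(1)])
        # The opening brace after the group begins the next event.
        start = match.end(1)
    events.append(raw[start:])
    return events


# Cut-down version of the sample: the first record contains nested objects.
sample = """[
{
"id": "1",
"foobar": {
"foo1": "bar1"
},
"barfoo": {
"bar1": "foo1"
}
},
{
"id": "2"
},
{
"id": "3"
}
]"""

events = split_events(sample)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

The }, that closes "foobar" is followed by a quoted key rather than a {, so the pattern skips it and only the commas between top-level records are consumed.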