I have a JSON file with 23,904 objects in it. They are all like:
{
"1.Entry": "1.Data",
...
"44.Entry": "44.Data"
},
... 23,902 similar entries ...
{
"1.Entry": "1.Data",
...
"44.Entry": "44.Data"
}
But forwarding the JSON file led to a count of 22,256 events (representing 22,256 JSON objects).
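To double-check the count on the source side, the objects can be counted with jq (a sketch; it assumes the file is one top-level JSON array, and yourfile.json is a placeholder for the real file name):
jq 'length' yourfile.json
If the objects are concatenated rather than wrapped in an array, the equivalent count would be:
jq -n '[inputs] | length' yourfile.json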
My props.conf
[json_test]
DATETIME_CONFIG =
TIMESTAMP_FIELDS = test.sys_created_on
INDEXED_EXTRACTIONS = json
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Structured
description = test json
disabled = false
pulldown_type = true
So the problem is not that a single event is truncated, but that the JSON file is.
Hello again,
my last entry,
"I've parsed my input file (with a JSON parser), and before one of the missing events there is an error, like an unexpected non-whitespace character.
So I think it is not a problem with Splunk!"
was a wrong result. I made a mistake in my investigation.
So I tried the program jq (on Ubuntu Linux) to validate the whole JSON file.
Surprise: there is no error in the JSON file. I checked the JSON file in the forwarder directory.
So I guess there is a character in the data that Splunk "misunderstands" and that breaks the JSON structure.
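For reference, a minimal jq validation looks like this (the 'empty' filter discards all output, so jq only reports something, and exits non-zero, if the file does not parse; yourfile.json is a placeholder):
jq empty yourfile.json && echo "valid JSON"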
There is something not right about this. If your events are indeed formed this way (as multiline entries) and your LINE_BREAKER is set to ([\r\n]+), there is no way they are ingested as a whole.
Tell us more about how you are ingesting it (and if you're reading a file with a forwarder, show us the relevant inputs.conf and props.conf stanzas from the forwarder).
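For illustration, a props.conf sketch that breaks on the boundary between two objects instead of on every newline could look like the stanza below (untested, and assuming the file is one large array; note that with INDEXED_EXTRACTIONS = json the structured parser does its own event breaking, so this would only apply without indexed extractions):
[json_test]
SHOULD_LINEMERGE = false
# break between "}," and "{"; the captured comma is discarded as the delimiter
LINE_BREAKER = \}(\s*,\s*)\{
TRUNCATE = 0
# without indexed extractions, extract the JSON fields at search time instead
KV_MODE = json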
Thank you for your questions, @PickleRick.
I'm using the forwarding mechanism.
Here are the stanzas from the forwarder:
inputs.conf
[monitor:///daten/datasources/data/mg_test/entry2group/*.json]
disabled = false
index = mg_test
sourcetype = json_test
crcSalt = <SOURCE>
whitelist = .*\d{8}_Q\d_entry_entry2group\.v\d\.(\d\d\.){2}json$
props.conf
[json_test]
DATETIME_CONFIG =
TIMESTAMP_FIELDS = test.sys_created_on
INDEXED_EXTRACTIONS = json
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Structured
description = test json
disabled = false
pulldown_type = true
I had copied this props.conf from my first attempt to upload the file (via Splunk Web).
Here is the stanza from ../etc/system/local/props.conf
[test_json]
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = json
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
TIMESTAMP_FIELDS = test.sys_created_on
category = Structured
description = JavaScript Object Notation format. For more information, visit http://json.org/
disabled = false
pulldown_type = true
Another investigation shows me that you are on the right track!
I found the following event in _internal:
08-25-2024 19:31:28.338 +0200 ERROR JsonLineBreaker [1737739 structuredparsing] - JSON StreamId:1586716756715697390 had parsing error:Unexpected character while looking for value: ',' - data_source="daten/datasources/data/mg_test/entry2group/20240825_Q2_entry_entry2group.v0.03.01.json]", data_host="socmg_local_fw", data_sourcetype="json_test"
So in the next step I will isolate one event (object) that is lost when there are special characters in the data.
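The "Unexpected character while looking for value: ','" message fits the sample structure above: the parser hits the comma between two objects. A possible workaround (a sketch, assuming the file is a single top-level JSON array; yourfile.json is a placeholder) is to convert the file to newline-delimited JSON before forwarding, so each object sits on its own line and matches LINE_BREAKER = ([\r\n]+):
jq -c '.[]' yourfile.json > yourfile.ndjson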
I've parsed my input file (with a JSON parser), and before one of the missing events there is an error, like an unexpected non-whitespace character.
So I think it is not a problem with Splunk!
Further investigation:
I shortened the JSON objects from 44 to 43 lines.
{
"1.Entry": "1.Data",
...
"43.Entry": "43.Data"
},
... 48,186 similar entries ...
{
"1.Entry": "1.Data",
...
"43.Entry": "43.Data"
}
But forwarding the JSON file led to a count of 45,352 events (representing 45,352 JSON objects) instead of 48,188 objects.
That's a little bit 'loco', I think.
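For reference, the number of indexed objects can be checked with a simple search against the index and sourcetype from the inputs.conf above:
index=mg_test sourcetype=json_test | stats count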
Hi @a101755,
Try adding the configs below to the monitor stanzas in inputs.conf.
crcSalt = <SOURCE>
initCrcLength = 2048
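Merged into the existing monitor stanza, that would look like this (initCrcLength raises the number of leading bytes used for the file's CRC, so files with similar headers are less likely to be mistaken for an already-read file; 2048 is just the suggested value):
[monitor:///daten/datasources/data/mg_test/entry2group/*.json]
crcSalt = <SOURCE>
initCrcLength = 2048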
Thank you, @manjunathmeti.
But it doesn't work; the result is the same as before.
I think your advice would help if Splunk skipped a whole file because it is not salted and/or its first characters do not differ from another file imported before.
Further investigation:
I exported the items from Splunk (CSV) and compared the original file with the export.
I can't see any pattern in which objects are imported and which are not. A pattern would be, for example, that the first 22,256 objects were imported.
Instead I see that object 66 is not imported, then 104, 108, 113, and so on.
I think there is a limit on importing JSON objects. But what is it?
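One way to pin down exactly which objects are missing is to compare a key field from the original file against the CSV export (a sketch; "1.Entry", the file names, and the assumption that this field is the first CSV column are placeholders for the real data):
jq -r '.[]."1.Entry"' original.json | sort > original_keys.txt
tail -n +2 export.csv | cut -d',' -f1 | tr -d '"' | sort > export_keys.txt
# keys present in the original but missing from the Splunk export
comm -23 original_keys.txt export_keys.txt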