I sent two events in JSON format to Splunk (Enterprise 6.4) via TCP. The second event was deliberately malformed: a string value was missing its closing quote.
The first event was successfully indexed. As expected, the second wasn't.
How do I troubleshoot this? For example, which Splunk log records the failure to ingest the second event?
If I send similarly malformed event data to the HTTP Event Collector (EC) as two events batched in a single request:
{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}
{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}
(note the deliberately missing closing quote after the bad value)
then, again, as expected, only the first event gets indexed. Unexpectedly, though, EC responds with:
{"text":"Success","code":0}
whereas, if I reverse the order of the JSON lines (putting the event with the bad value first), I get:
{"text":"Invalid data format","code":6,"invalid-event-number":0}
(For JSON parsing errors in EC input, I've seen that the data.num_of_parser_errors metric in the _introspection index for that time period gets incremented. But that's all the evidence I can see: I don't see the specific error details logged anywhere.)
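Since the collector's response does not reliably say which event in a batch is bad, one workaround is to check each line on the client before sending. The following is a minimal sketch in Python, not anything Splunk provides: the function name find_malformed_events is hypothetical, and it assumes the batch holds one JSON object per line, as in the example above. The zero-based index it reports is modeled on the invalid-event-number field in the error response.

```python
import json


def find_malformed_events(payload):
    """Return (zero-based index, error message) for each line that is not valid JSON."""
    problems = []
    # Assumption: one JSON event per non-empty line.
    lines = [line for line in payload.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((index, exc.msg))
    return problems


# The two events from the question; the second is missing a closing quote.
payload = (
    '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}\n'
    '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}\n'
)

for index, message in find_malformed_events(payload):
    print(f"event {index} is malformed: {message}")
```

This only catches syntax errors that Python's json module rejects. It says nothing about how Splunk itself parses or logs the same input.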
I think I'll leave this question up for a few days longer as a testament to my own ignorance, and then delete it.
I might ask a new question later around similar issues, based on my recent, slightly better understanding. (For example, although much of my question is based on bogus assumptions, that HEC behavior I reported still looks dodgy to me.)
On with the self-flagellation:
I sent two events...
No, I didn't.
I sent two lines of JSON, each ending in \r\n, but, in props.conf, I had failed to specify SHOULD_LINEMERGE = false. So the two lines were being treated as a single event.
If I had bothered to look at the _raw field, I would have noticed that the JSON line with the "bad (missing closing quote) value was appended to the "good" line, in a single event.
After adding SHOULD_LINEMERGE = false and resending the data, I get two events. The first event has a myfield value of good. The second event has no myfield value.
The first event was successfully indexed. As expected, the second wasn't.
My expectation was wrong.
As described above, after adding SHOULD_LINEMERGE = false, the second event (with the missing closing quote) is indexed. It just doesn't have a myfield value, because the JSON is malformed.