Where does Splunk log errors about malformed JSON input data?

Graham_Hanningt
Builder

I sent two events in JSON format to Splunk (Enterprise 6.4) via TCP. The second event was deliberately malformed: a string value was missing its closing quote.

The first event was successfully indexed. As expected, the second wasn't.

How do I troubleshoot this? For example, which Splunk log records the failure to ingest the second event?

If I send similarly malformed event data to the HTTP Event Collector (EC) as two events batched in a single request:

{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}
{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}

(note the deliberately missing closing quote after the bad value)

then, again, as expected, only the first event gets indexed. Unexpectedly, though, EC responds with:

{"text":"Success","code":0}

whereas, if I reverse the order of the JSON lines (putting the event with the bad value first), I get:

{"text":"Invalid data format","code":6,"invalid-event-number":0}

(For JSON parsing errors in EC input, I've seen that the data.num_of_parser_errors metric in the _introspection index for that time period gets incremented. But that's all the evidence I can see: I don't see the specific error details logged anywhere.)
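Since the specific parse error doesn't appear to be logged, one option is to check each line of the batch on the sending side before posting it. The following is a minimal sketch using only Python's standard json module. The function name find_malformed_lines is hypothetical, and the check only reports which lines fail strict JSON parsing; it makes no claim about how Splunk itself parses the payload.

```python
import json


def find_malformed_lines(payload: str):
    """Return (line_index, error_message) for each non-empty line that is not valid JSON.

    Hypothetical helper for client-side checking; line_index is zero-based.
    """
    problems = []
    for index, line in enumerate(payload.splitlines()):
        if not line.strip():
            continue  # ignore blank lines between events
        try:
            json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((index, err.msg))
    return problems


# The two-event batch from the question; the second line is missing a closing quote.
batch = (
    '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}\n'
    '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}\n'
)

for index, message in find_malformed_lines(batch):
    print(f"line {index}: {message}")
```

Run against the batch above, this flags only the second line (index 1), regardless of the order in which the events would be sent.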

Graham_Hanningt
Builder

I think I'll leave this question up for a few days longer as a testament to my own ignorance, and then delete it.

I might ask a new question later around similar issues, based on my recent, slightly better understanding. (For example, although much of my question is based on bogus assumptions, that HEC behavior I reported still looks dodgy to me.)

On with the self-flagellation:

> I sent two events...

No, I didn't.

I sent two lines of JSON, each ending in \r\n, but, in props.conf, I had failed to specify SHOULD_LINEMERGE = false. So the two lines were being treated as a single event.

If I had bothered to look at the _raw field, I would have noticed that the JSON line with the "bad" (missing closing quote) value was appended to the "good" line, in a single event.

After adding SHOULD_LINEMERGE = false and resending the data, I get two events. The first event has a myfield value of good. The second event has no myfield value.
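For reference, the props.conf change would look something like the sketch below. The stanza name my_test is an assumption: it is the sourcetype used in the HTTP Event Collector example in the question, and the sourcetype of the TCP input isn't stated, so substitute whichever sourcetype, source, or host stanza applies to your input.

```
[my_test]
# Treat each line as its own event instead of merging lines into a multi-line event
SHOULD_LINEMERGE = false
```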

> The first event was successfully indexed. As expected, the second wasn't.

My expectation was wrong.

As described above, after adding SHOULD_LINEMERGE = false, the second event (with the missing closing quote) is indexed. It just doesn't have a myfield value, because the JSON is malformed.
