Where does Splunk log errors about malformed JSON input data?

Graham_Hanningt
Builder

I sent two events in JSON format to Splunk (Enterprise 6.4) via TCP. The second event was deliberately malformed: a string value was missing its closing quote.

The first event was successfully indexed. As expected, the second wasn't.

How do I troubleshoot this? For example, which Splunk log records the failure to ingest the second event?

If I send similarly malformed event data to the HTTP Event Collector (EC) as two events batched in a single request:

{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}
{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}

(note the deliberately missing closing quote after the bad value)

then, again, as expected, only the first event gets indexed. Unexpectedly, though, EC responds with:

{"text":"Success","code":0}

whereas, if I reverse the order of the JSON lines (putting the event with the bad value first), I get:

{"text":"Invalid data format","code":6,"invalid-event-number":0}

(For JSON parsing errors in EC input, I've seen that the data.num_of_parser_errors metric in the _introspection index for that time period gets incremented. But that's all the evidence I can see: I don't see the specific error details logged anywhere.)
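Since the response above only reports an event number when the bad event comes first, one way to pin down which line of a batch is malformed is to check each line locally before sending it. The following is a minimal sketch in Python using only the standard json module. The function name find_malformed_lines is hypothetical, and this is a client-side pre-check, not a description of how the Event Collector itself parses the request.

```python
import json


def find_malformed_lines(payload):
    """Return (zero-based line index, parser message) for each non-blank
    line of the payload that is not valid JSON on its own."""
    problems = []
    lines = [line for line in payload.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((index, exc.msg))
    return problems


# The two lines from the question; the second is missing a closing quote.
good = '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}'
bad = '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}'

batch = good + "\n" + bad
for index, message in find_malformed_lines(batch):
    print(f"line {index}: {message}")
```

Run against the batch in either order, this flags the line with the missing closing quote by its position, regardless of what the server response says.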

Graham_Hanningt
Builder

I think I'll leave this question up for a few days longer as a testament to my own ignorance, and then delete it.

I might ask a new question later around similar issues, based on my recent, slightly better understanding. (For example, although much of my question is based on bogus assumptions, that HEC behavior I reported still looks dodgy to me.)

On with the self-flagellation:

"I sent two events..."

No, I didn't.

I sent two lines of JSON, each ending in \r\n, but, in props.conf, I had failed to specify SHOULD_LINEMERGE = false. So the two lines were being treated as a single event.

If I had bothered to look at the _raw field, I would have noticed that the JSON line with the bad value (the one missing its closing quote) was appended to the "good" line, in a single event.

After adding SHOULD_LINEMERGE = false and resending the data, I get two events. The first event has a myfield value of good. The second event has no myfield value.

"The first event was successfully indexed. As expected, the second wasn't."

My expectation was wrong.

As described above, after adding SHOULD_LINEMERGE = false, the second event (with the missing closing quote) is indexed. It just doesn't have a myfield value, because the JSON is malformed.
