Getting Data In

HTTP Event Collector: Getting error "HttpInputDataHandler - Parsing error".

rphillips_splk
Splunk Employee

I have Splunk set up as an HTTP Event Collector receiver and am seeing parsing errors in splunkd.log like: ERROR HttpInputDataHandler - Parsing error.

How do I resolve these?

1 Solution

rphillips_splk
Splunk Employee

You may see parsing errors like the ones below. Note that these events do not show the client IP.

03-17-2020 12:55:50.841 -0400 ERROR HttpInputDataHandler - Parsing error : Got unexpected null element while expecting event's raw text, totalRequestSize=133

05-29-2020 12:36:34.333 -0400 ERROR HttpInputDataHandler - Parsing error : Event field cannot be blank

05-29-2020 12:35:32.005 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event's raw text: Unexpected character while looking for value: '}', totalRequestSize=40

05-29-2020 12:33:03.569 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event object key: Unexpected character: 'e', totalRequestSize=66

Group by the event timestamp, because two events are logged for each failed request: one records the parsing error, and the other records the response sent back to the client, which includes the client IP and the reply code.

For example:
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Parsing error : No data
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=5, events_processed=0, http_input_body_size=54

05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event object to start: Unexpected character while looking for value: '\', totalRequestSize=69
05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=6, events_processed=1, http_input_body_size=69

Grouping by _time and host makes these parsing errors easier to decipher:

index=_internal source=*splunkd.log HttpInputDataHandler ERROR | stats values(_raw) by _time host
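If you only have a copy of splunkd.log and no search head at hand, the same pairing can be approximated offline. The following is a minimal Python sketch, not anything Splunk ships: the function name pair_hec_errors and both regular expressions are my own assumptions, written against the sample lines above. It groups on the timestamp alone, so two clients failing in the same millisecond would be merged.

```python
import re
from collections import defaultdict

# Sample lines copied from the examples above.
SAMPLE_LOG = """\
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Parsing error : No data
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=5, events_processed=0, http_input_body_size=54
05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event object to start: Unexpected character while looking for value: '\\', totalRequestSize=69
05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=6, events_processed=1, http_input_body_size=69
"""

# Assumed line layout: "<timestamp> <level> HttpInputDataHandler - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"(?P<level>\w+)\s+HttpInputDataHandler - (?P<msg>.*)$"
)
# key=value pairs as they appear in the "Failed processing http input" line
KV_RE = re.compile(r"(\w+)=([^,\s]+)")


def pair_hec_errors(log_text):
    """Group HttpInputDataHandler lines by timestamp and merge each pair."""
    grouped = defaultdict(dict)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # not an HttpInputDataHandler line
        entry = grouped[match.group("ts")]
        msg = match.group("msg")
        if msg.startswith("Parsing error"):
            # Keep everything after the first colon as the error text.
            entry["parsing_error"] = msg.split(":", 1)[1].strip()
        elif msg.startswith("Failed processing http input"):
            entry.update(KV_RE.findall(msg))
    return dict(grouped)


if __name__ == "__main__":
    for ts, fields in pair_hec_errors(SAMPLE_LOG).items():
        print(ts, fields.get("source_IP"), fields.get("reply"),
              fields.get("parsing_error"))
```

Each resulting entry carries the parsing error text next to the source_IP and reply values from the companion line, which is the same information the stats search puts side by side.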

Based on the reply code and client source_IP, you should examine the syntax of the request sent by that client or check the health of the HEC receiver.
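To examine a client's request syntax before it reaches the receiver, a rough local pre-check can help. The sketch below is only an approximation under my own assumptions: precheck_hec_payload is a hypothetical helper, it uses Python's json module rather than Splunk's parser, and the reply codes in its messages are hints for comparison, not a guarantee of what the receiver will return. It assumes the body is one or more JSON objects sent back to back, as used for batching on the event endpoint.

```python
import json


def precheck_hec_payload(body):
    """Return a list of problems found in a HEC event request body.

    Approximates the checks implied by reply codes 5, 6, 12 and 13.
    This is NOT Splunk's parser and may disagree with it on edge cases.
    """
    text = body.strip()
    if not text:
        return ["no data (compare reply=5)"]

    problems = []
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        try:
            # raw_decode parses one JSON value and reports where it ended,
            # which lets us walk through concatenated objects.
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError as exc:
            problems.append(
                f"invalid JSON at offset {exc.pos}: {exc.msg} (compare reply=6)"
            )
            break
        if not isinstance(obj, dict):
            problems.append("top-level value is not a JSON object (compare reply=6)")
        elif "event" not in obj:
            problems.append("missing 'event' field (compare reply=12)")
        elif obj["event"] is None:
            problems.append("null 'event' field (compare the 'unexpected null element' error)")
        elif obj["event"] == "":
            problems.append("blank 'event' field (compare reply=13)")
        pos = end
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return problems


if __name__ == "__main__":
    for sample in ['{"event": "hello"}', '{"event": ""}', '{"event": }', ""]:
        print(repr(sample), "->", precheck_hec_payload(sample))
```

An empty list means the body passed these basic checks. It does not confirm that the token, index, or channel is valid, since those are checked by the receiver.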

HEC reply codes:
reply  HttpInputReply             status               event_message
0      Success                    OK                   Success
1      TokenDisabled              FORBIDDEN            Token disabled
2      NoAuthorization            UNAUTHORIZED         Token is required
3      InvalidAuthorization       UNAUTHORIZED         Invalid authorization
4      TokenNotFound              FORBIDDEN            Invalid token
5      NoData                     BAD_REQUEST          No data
6      InvalidData                BAD_REQUEST          Invalid data format
7      IncorrectIndex             BAD_REQUEST          Incorrect index
8      ServerError                (removed, as it is not used anywhere)
9      ServerBusy                 SERVICE_UNAVAILABLE  Server is busy
10     NoChannel                  BAD_REQUEST          Data channel is missing
11     InvalidChannel             BAD_REQUEST          Invalid data channel
12     NoEvent                    BAD_REQUEST          Event field is required
13     BlankEvent                 BAD_REQUEST          Event field cannot be blank
14     AckDisabled                BAD_REQUEST          ACK is disabled
15     UnsupportedType            BAD_REQUEST          Error in handling indexed fields
16     QueryStringAuthNotEnabled  BAD_REQUEST          Query string authorization is not enabled
17     HECHealthy                 OK                   HEC is healthy
18     QueuesFull                 SERVICE_UNAVAILABLE  HEC is unhealthy, queues are full
19     AckUnavailable             SERVICE_UNAVAILABLE  HEC is unhealthy, ack service unavailable
20     QueuesFullAckUnavailable   SERVICE_UNAVAILABLE  Hec is unhealthy, queues are full, ack service unavailable
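For scripts that post-process these logs, the table can be carried as a simple lookup. This is a small Python sketch of that idea; the names HEC_REPLY_CODES and describe_reply are my own, and the values are transcribed from the table above rather than taken from any Splunk library.

```python
# reply code -> (HttpInputReply, status, event_message), per the table above.
# Code 8 (ServerError) is omitted because the table lists it as removed.
HEC_REPLY_CODES = {
    0: ("Success", "OK", "Success"),
    1: ("TokenDisabled", "FORBIDDEN", "Token disabled"),
    2: ("NoAuthorization", "UNAUTHORIZED", "Token is required"),
    3: ("InvalidAuthorization", "UNAUTHORIZED", "Invalid authorization"),
    4: ("TokenNotFound", "FORBIDDEN", "Invalid token"),
    5: ("NoData", "BAD_REQUEST", "No data"),
    6: ("InvalidData", "BAD_REQUEST", "Invalid data format"),
    7: ("IncorrectIndex", "BAD_REQUEST", "Incorrect index"),
    9: ("ServerBusy", "SERVICE_UNAVAILABLE", "Server is busy"),
    10: ("NoChannel", "BAD_REQUEST", "Data channel is missing"),
    11: ("InvalidChannel", "BAD_REQUEST", "Invalid data channel"),
    12: ("NoEvent", "BAD_REQUEST", "Event field is required"),
    13: ("BlankEvent", "BAD_REQUEST", "Event field cannot be blank"),
    14: ("AckDisabled", "BAD_REQUEST", "ACK is disabled"),
    15: ("UnsupportedType", "BAD_REQUEST", "Error in handling indexed fields"),
    16: ("QueryStringAuthNotEnabled", "BAD_REQUEST",
         "Query string authorization is not enabled"),
    17: ("HECHealthy", "OK", "HEC is healthy"),
    18: ("QueuesFull", "SERVICE_UNAVAILABLE", "HEC is unhealthy, queues are full"),
    19: ("AckUnavailable", "SERVICE_UNAVAILABLE",
         "HEC is unhealthy, ack service unavailable"),
    20: ("QueuesFullAckUnavailable", "SERVICE_UNAVAILABLE",
         "Hec is unhealthy, queues are full, ack service unavailable"),
}


def describe_reply(code):
    """Turn a reply value from splunkd.log (int or str) into readable text."""
    entry = HEC_REPLY_CODES.get(int(code))
    if entry is None:
        return f"reply={code}: not listed in the reply code table"
    name, status, message = entry
    return f"reply={code} {name} ({status}): {message}"


if __name__ == "__main__":
    print(describe_reply("5"))
    print(describe_reply(6))
```

The function accepts the reply value as a string because that is how it comes out of a parsed log line.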

Note: I tested on Splunk 8.0.4, where the "response" event that includes the source_IP and reply fields is logged at log_level=ERROR.

Earlier versions of Splunk require setting the HttpInputDataHandler component to DEBUG to see these events.

For example, on the HEC receiver Splunk instance:
./splunk set log-level HttpInputDataHandler -level DEBUG

Then set it back to normal:
./splunk set log-level HttpInputDataHandler -level WARN


rphillips_splk
Splunk Employee

Alternatively, you can run a search whose output is easier to digest:

index=_internal source=*splunkd.log* (log_level=ERROR OR log_level=DEBUG) component=HttpInputDataHandler reply=*
| eval response_to_client=case(reply=="0","success",reply=="1","Token disabled",reply=="2","Token is required",reply=="3","Invalid authorization",reply=="4","Invalid token",reply=="5","No data",reply=="6","Invalid data format",reply=="7","Incorrect index",reply=="9","Server is busy",reply=="10","Data channel is missing",reply=="11","Invalid data channel",reply=="12","Event field is required",reply=="13","Event field cannot be blank",reply=="14","ACK is disabled",reply=="15","Error in handling indexed fields",reply=="16","Query string authorization is not enabled",reply=="17","HEC is healthy",reply=="18","HEC is unhealthy, queues are full",reply=="19","HEC is unhealthy, ack service unavailable",reply=="20","Hec is unhealthy, queues are full, ack service unavailable")
| stats count by host name channel source_IP response_to_client reply
| rename host as "HEC Receiver" source_IP as "HEC client"

