HTTP Event Collector: Getting error "HttpInputDataHandler - Parsing error".

rphillips_splk
Splunk Employee

I have Splunk set up as an HTTP Event Collector receiver and am seeing parsing errors in splunkd.log like: ERROR HttpInputDataHandler - Parsing error.

How do I resolve these?

1 Solution

rphillips_splk
Splunk Employee

You may see parsing errors like the ones below. Note that these events do not show the client IP.

03-17-2020 12:55:50.841 -0400 ERROR HttpInputDataHandler - Parsing error : Got unexpected null element while expecting event's raw text, totalRequestSize=133

05-29-2020 12:36:34.333 -0400 ERROR HttpInputDataHandler - Parsing error : Event field cannot be blank

05-29-2020 12:35:32.005 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event's raw text: Unexpected character while looking for value: '}', totalRequestSize=40

05-29-2020 12:33:03.569 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event object key: Unexpected character: 'e', totalRequestSize=66

Group the events by timestamp, because two events are logged for each failure: one records the parsing error, and the other records the response sent back to the client, which includes the client IP and the reply code.

For example:
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Parsing error : No data
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=5, events_processed=0, http_input_body_size=54

05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event object to start: Unexpected character while looking for value: '\', totalRequestSize=69
05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=6, events_processed=1, http_input_body_size=69

Grouping by _time and host makes these parsing errors easier to decipher:

index=_internal source=*splunkd.log HttpInputDataHandler ERROR | stats values(_raw) by _time host
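The same pairing can be sketched outside Splunk, for example when you only have a copy of splunkd.log. This is a minimal Python illustration, not part of the original answer: the function name pair_hec_errors and both regular expressions are assumptions inferred from the sample lines above. It keys on the timestamp only, so two failures logged in the same millisecond would be merged.

```python
import re
from collections import defaultdict

# Sample lines copied from the splunkd.log excerpts above.
SAMPLE_LOG = r"""
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Parsing error : No data
05-29-2020 12:39:42.473 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=5, events_processed=0, http_input_body_size=54
05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Parsing error : While expecting event object to start: Unexpected character while looking for value: '\', totalRequestSize=69
05-29-2020 13:00:10.016 -0400 ERROR HttpInputDataHandler - Failed processing http input, token name=idx_cluster_token, channel=FE0ECFAD-13D5-401A-847D-77833DD77131, source_IP=10.140.49.235, reply=6, events_processed=1, http_input_body_size=69
"""

# Assumed line layout: "<timestamp> <level> HttpInputDataHandler - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"(?P<level>\w+)\s+HttpInputDataHandler - (?P<msg>.*)$"
)
# key=value pairs such as source_IP=10.140.49.235 or reply=5
FIELD_RE = re.compile(r"(\w+)=([^,\s]+)")


def pair_hec_errors(log_text):
    """Group HttpInputDataHandler lines by timestamp.

    Returns {timestamp: {"parsing_error": ..., "source_IP": ..., "reply": ...}}.
    """
    groups = defaultdict(dict)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an HttpInputDataHandler line
        entry = groups[match["ts"]]
        msg = match["msg"]
        if msg.startswith("Parsing error"):
            # Keep everything after the first colon as the error text.
            entry["parsing_error"] = msg.split(":", 1)[1].strip()
        elif msg.startswith("Failed processing http input"):
            # Merge token name, channel, source_IP, reply, and so on.
            entry.update(FIELD_RE.findall(msg))
    return dict(groups)


if __name__ == "__main__":
    for ts, fields in pair_hec_errors(SAMPLE_LOG).items():
        print(ts, fields.get("source_IP"), fields.get("reply"),
              fields.get("parsing_error"))
```

Each entry then carries both the error text and the client details, which is what the stats search above gives you per _time and host.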

Based on the reply code and client source_IP, you should examine the syntax of the request sent by that client or check the health of the HEC receiver.
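To examine a client's request syntax before it reaches the receiver, a rough local pre-check can help. The sketch below is hypothetical: precheck_hec_body is not a Splunk function, it assumes the client posts JSON events (one object or several concatenated), and the mapping from each condition to a reply code is only my reading of the message texts in the reply code table that follows, not confirmed receiver behavior. In particular, returning 5 for an empty body and 6 for a null event are assumptions.

```python
import json


def precheck_hec_body(body):
    """Guess which reply code a JSON event body might trigger (0 = looks fine).

    Hypothetical helper; the real receiver may apply other checks.
    """
    text = body.strip()
    if not text:
        return 5  # assumption: an empty body maps to "No data"

    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        try:
            # raw_decode parses one JSON value and returns where it stopped,
            # which lets us walk through concatenated event objects.
            obj, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            return 6  # "Invalid data format"
        if not isinstance(obj, dict):
            return 6
        if "event" not in obj:
            return 12  # "Event field is required"
        if obj["event"] is None:
            return 6  # assumption: the post does not say which code null gets
        if obj["event"] == "":
            return 13  # "Event field cannot be blank"
        # Skip whitespace between concatenated objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return 0


if __name__ == "__main__":
    for sample in ['{"event": "hello"}', '{"event": }', '{"event": ""}']:
        print(repr(sample), "->", precheck_hec_body(sample))
```

The check stops at the first bad object, which is consistent with the second log example above, where events_processed=1 is reported alongside reply=6.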

HEC reply codes:

reply  HttpInputReply             status               event_message
0      Success                    OK                   Success
1      TokenDisabled              FORBIDDEN            Token disabled
2      NoAuthorization            UNAUTHORIZED         Token is required
3      InvalidAuthorization       UNAUTHORIZED         Invalid authorization
4      TokenNotFound              FORBIDDEN            Invalid token
5      NoData                     BAD_REQUEST          No data
6      InvalidData                BAD_REQUEST          Invalid data format
7      IncorrectIndex             BAD_REQUEST          Incorrect index
8      ServerError                (removed, as it is not used anywhere)
9      ServerBusy                 SERVICE_UNAVAILABLE  Server is busy
10     NoChannel                  BAD_REQUEST          Data channel is missing
11     InvalidChannel             BAD_REQUEST          Invalid data channel
12     NoEvent                    BAD_REQUEST          Event field is required
13     BlankEvent                 BAD_REQUEST          Event field cannot be blank
14     AckDisabled                BAD_REQUEST          ACK is disabled
15     UnsupportedType            BAD_REQUEST          Error in handling indexed fields
16     QueryStringAuthNotEnabled  BAD_REQUEST          Query string authorization is not enabled
17     HECHealthy                 OK                   HEC is healthy
18     QueuesFull                 SERVICE_UNAVAILABLE  HEC is unhealthy, queues are full
19     AckUnavailable             SERVICE_UNAVAILABLE  HEC is unhealthy, ack service unavailable
20     QueuesFullAckUnavailable   SERVICE_UNAVAILABLE  Hec is unhealthy, queues are full, ack service unavailable
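If you post-process these log lines in a script, the table above can be carried as a small lookup. The following Python sketch is illustrative only: the names HEC_REPLY_CODES, describe_reply, and where_to_look are my own, and the triage rule (SERVICE_UNAVAILABLE points at the receiver, other non-OK statuses point at the client request or the token it sends) is a rough assumption based on the status column, not something Splunk documents in this form.

```python
# reply code -> (HttpInputReply, status, event_message), copied from the table.
# Code 8 (ServerError) is omitted because it was removed.
HEC_REPLY_CODES = {
    0: ("Success", "OK", "Success"),
    1: ("TokenDisabled", "FORBIDDEN", "Token disabled"),
    2: ("NoAuthorization", "UNAUTHORIZED", "Token is required"),
    3: ("InvalidAuthorization", "UNAUTHORIZED", "Invalid authorization"),
    4: ("TokenNotFound", "FORBIDDEN", "Invalid token"),
    5: ("NoData", "BAD_REQUEST", "No data"),
    6: ("InvalidData", "BAD_REQUEST", "Invalid data format"),
    7: ("IncorrectIndex", "BAD_REQUEST", "Incorrect index"),
    9: ("ServerBusy", "SERVICE_UNAVAILABLE", "Server is busy"),
    10: ("NoChannel", "BAD_REQUEST", "Data channel is missing"),
    11: ("InvalidChannel", "BAD_REQUEST", "Invalid data channel"),
    12: ("NoEvent", "BAD_REQUEST", "Event field is required"),
    13: ("BlankEvent", "BAD_REQUEST", "Event field cannot be blank"),
    14: ("AckDisabled", "BAD_REQUEST", "ACK is disabled"),
    15: ("UnsupportedType", "BAD_REQUEST", "Error in handling indexed fields"),
    16: ("QueryStringAuthNotEnabled", "BAD_REQUEST",
         "Query string authorization is not enabled"),
    17: ("HECHealthy", "OK", "HEC is healthy"),
    18: ("QueuesFull", "SERVICE_UNAVAILABLE",
         "HEC is unhealthy, queues are full"),
    19: ("AckUnavailable", "SERVICE_UNAVAILABLE",
         "HEC is unhealthy, ack service unavailable"),
    20: ("QueuesFullAckUnavailable", "SERVICE_UNAVAILABLE",
         "Hec is unhealthy, queues are full, ack service unavailable"),
}


def describe_reply(code):
    """Return a one-line description for a reply value (int or string)."""
    entry = HEC_REPLY_CODES.get(int(code))
    if entry is None:
        return f"reply={code}: not in the table"
    name, status, message = entry
    return f"reply={code} {name} ({status}): {message}"


def where_to_look(code):
    """Rough triage (an assumption): which side to investigate first."""
    entry = HEC_REPLY_CODES.get(int(code))
    if entry is None:
        return "unknown"
    status = entry[1]
    if status == "OK":
        return "nothing to fix"
    if status == "SERVICE_UNAVAILABLE":
        return "HEC receiver"
    # BAD_REQUEST, UNAUTHORIZED, FORBIDDEN: the request or the token it sends
    return "client request"


if __name__ == "__main__":
    for reply in ("5", "6", "18"):
        print(describe_reply(reply), "->", where_to_look(reply))
```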

Note: I tested on Splunk 8.0.4, where the "response" event, which includes the source_IP and reply fields, is logged at log_level=ERROR.

Earlier versions of Splunk require setting the HttpInputDataHandler component to DEBUG to see these events.

For example, on the HEC receiver Splunk instance:
./splunk set log-level HttpInputDataHandler -level DEBUG

Then set it back to normal:
./splunk set log-level HttpInputDataHandler -level WARN

rphillips_splk
Splunk Employee

Alternatively, you could run a search whose output is easier to digest:

index=_internal source=*splunkd.log* (log_level=ERROR OR log_level=DEBUG) component=HttpInputDataHandler reply=*
| eval response_to_client=case(
    reply=="0","success",
    reply=="1","Token disabled",
    reply=="2","Token is required",
    reply=="3","Invalid authorization",
    reply=="4","Invalid token",
    reply=="5","No data",
    reply=="6","Invalid data format",
    reply=="7","Incorrect index",
    reply=="9","Server is busy",
    reply=="10","Data channel is missing",
    reply=="11","Invalid data channel",
    reply=="12","Event field is required",
    reply=="13","Event field cannot be blank",
    reply=="14","ACK is disabled",
    reply=="15","Error in handling indexed fields",
    reply=="16","Query string authorization is not enabled",
    reply=="17","HEC is healthy",
    reply=="18","HEC is unhealthy, queues are full",
    reply=="19","HEC is unhealthy, ack service unavailable",
    reply=="20","Hec is unhealthy, queues are full, ack service unavailable")
| stats count by host name channel source_IP response_to_client reply
| rename host as "HEC Receiver" source_IP as "HEC client"
