Solved: Re: Why this error after upgrade to 9.0 "ERROR Tc...

hrawat · ‎07-07-2022

After upgrading to 9.0, I am seeing the following error:

ERROR TcpOutputQ [<thread id> TcpOutEloop] - Unexpected event id=<eventid>

hrawat · ‎07-07-2022

If useACK is set to true and batch mode is on (it is on by default) with Splunk 9.0, you may hit the following error log messages:

"Unexpected event id"
"Invalid ACK received from indexer"
"Got unexpected ACK with eventid"

This may also lead to blocked queues on the forwarding tier.

While processing an event, autoLBVolume and autoBatch apply their limits using the raw size of the event. However, for events with no raw data (e.g. metrics events), autoLBVolume and autoBatch end up sending many more events to the receiver than the configured limits allow.
With autoLBVolume, this results in more events than expected/configured being distributed to receivers.

With autoBatch, this results in a batch containing many more events than expected. While a batch of thousands of events is still being sent to the receiver, some of its events are already being acknowledged.
The forwarder creates the list of events to be acknowledged only after it has successfully sent the whole batch. If the batch is still in flight at the TCP layer and the forwarder receives an ACK for one of its events, that event is not yet in the list of events expected to be acknowledged. That leads to the ERROR above.

Workaround: set either useACK=false or autoBatch=false.
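As a rough sketch of what that workaround could look like in outputs.conf on the forwarder: the setting names come from this thread, but the use of the global [tcpout] stanza and the file location are assumptions, so adjust them to wherever your output settings already live.

```ini
# outputs.conf on the forwarder
# (assumed location, e.g. $SPLUNK_HOME/etc/system/local/outputs.conf
#  or the local directory of the app that ships your outputs)
[tcpout]
# Option A: keep indexer acknowledgment, turn off batching
autoBatch = false

# Option B: keep batching, turn off indexer acknowledgment
# (gives up the data-loss protection that useACK provides)
# useACK = false
```

Only one of the two options should be needed. Changes to outputs.conf typically require a forwarder restart to take effect.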

The issue is fixed in the 9.0.3 patch.

Note:
After upgrading to 9.0.3, you will still see the benign “Unexpected event id” log message. However, the following log messages should no longer appear:
"Invalid ACK received from indexer"
"Got unexpected ACK with eventid"

vinayakwagh · ‎07-15-2022

It helps. Thanks.

anandhalagaras1 · ‎02-08-2024

@lawrence_magpoc ,

I am running Splunk Universal Forwarder 9.0.2 on one of my Linux client machines, and for the past couple of days I have been getting these events in the internal logs. The forwarder seems to crash, and then the service restarts automatically.

[build 17e00c557dc1] 2024-02-08 05:26:15 Received fatal signal 6 (Aborted) on PID 1908113. Cause: Signal sent by PID 1908113 running under UID 9991. Crashing thread: TcpOutEloop Registers: RIP: [0x00007F65EB39AACF] gsignal + 271 (libc.so.6 + 0x4EACF)

ERROR TcpOutputQ [1908232 TcpOutEloop] - Unexpected event id=30

ERROR TcpOutputQ [1908232 TcpOutEloop] - Unexpected event id=29

So how do I fix this issue, and in which config file on the client machine where the UF is running do I need to add the following?

autoBatch=false

hrawat · ‎02-08-2024

In outputs.conf, set:

useACK=false

autoBatch=false

woodcock · ‎11-01-2022

It is back in v9.0.1

hrawat · ‎11-01-2022

See my updated answer. 9.0.1 still logs the ERROR, but it does not block the forwarder.

Sithima · ‎10-05-2022

If the issue is fixed in 9.0.1, why am I getting the same error message in Splunk 9.0.1?

ERROR TcpOutputQ [<id> TcpOutEloop] - Unexpected eventid=<id>

hrawat · ‎10-06-2022

9.0.1 has not suppressed the ERROR log. It fixes the underlying tcpout queue blockage issue. If you see the ERROR log but no tcpout queue blockage (as was seen with 9.0.0), that indicates the tcpout queue blockage issue is resolved.

We will suppress the benign ERROR log from 9.0.1 in future releases.

woodcock · ‎11-01-2022

Actually the problem is still there. I was getting continuous crashes on my HWF.

hrawat · ‎11-01-2022

That crash is still an issue and will be fixed. It happens if forceTimebasedAutoLB=true.
Workaround for the 9.0.1 TcpOutputQ crash:
Set one of the following:

forceTimebasedAutoLB=false

or

autoBatch=false

or

connectionsPerTarget=1

This crash applies if the UF/HF resolves fewer than 10 target IP addresses and forceTimebasedAutoLB=true.
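For illustration, the three alternatives could be written in outputs.conf roughly as follows. This is only a sketch: placing the settings in the global [tcpout] stanza is an assumption, and they may equally belong in your own target group stanza.

```ini
# outputs.conf on the affected UF/HF
[tcpout]
# Pick ONE of the three alternatives; leave the other two commented out.

# Alternative 1: stop forcing time-based load balancing
forceTimebasedAutoLB = false

# Alternative 2: turn off batching
# autoBatch = false

# Alternative 3: use a single connection per target
# connectionsPerTarget = 1
```

Which alternative fits best depends on whether you rely on forceTimebasedAutoLB for event distribution; if you do, one of the other two leaves it enabled.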

pmerlin1 · ‎12-22-2023

The issue is not fixed after upgrading to 9.1.2. This issue occurred on a search head cluster.

My settings in outputs.conf:

[indexer_discovery:target_master]
pass4SymmKey = **********

[tcpout]
defaultGroup = default_indexers
forceTimebasedAutoLB = true
maxQueueSize = 7MB
useACK = true

[tcpout:default_indexers]
server = **********01:9997,**********02.lan:9997

hrawat · ‎12-22-2023

Regarding the following three logs:

"Unexpected event id" (9.1.2 still logs this)
"Invalid ACK received from indexer" (9.1.2 should not log this)
"Got unexpected ACK with eventid" (9.1.2 should not log this)

What exactly is the issue you are hitting?

Vwagh · ‎12-22-2023

It needs to be

useACK = false

Then these errors should be resolved.

pmerlin1 · ‎02-08-2024

useACK = <boolean>
* Whether or not to use indexer acknowledgment.
* Indexer acknowledgment is an optional capability on forwarders that helps
  prevent loss of data when sending data to an indexer.

The workaround means you don't use indexer acknowledgment, so you run the risk of losing data during an indexer restart. That solution is not suitable for me.

hrawat · ‎02-08-2024

Upgrade to 9.0.3 or above.

pmerlin1 · ‎02-08-2024

The message appeared on forwarders (search head cluster) upgraded to 9.1.2.

settings:

forceTimebasedAutoLB = false

useACK = true

autoLBFrequency = 30

Upgrading doesn't change the behavior on full Splunk Enterprise.

But it seems to work on the Universal Forwarder.

hrawat · ‎02-09-2024

Can you let us know which log you see?

Of the following three logs:

"Unexpected event id" (9.1.2 still logs this)
"Invalid ACK received from indexer" (9.1.2 should not log this)
"Got unexpected ACK with eventid" (9.1.2 should not log this)

pmerlin1 · ‎02-12-2024

The message still logged is:

"Unexpected event id"

hrawat · ‎02-14-2024

This is an expected and benign ERROR. We will change the log level to INFO in the future.
