
What could be causing intermittent "NetFlowDecoder::decodeFlow Unable to decode flow set data. No template with id" messages?

Explorer

Hello,

I recently set up Splunk Stream to receive NetFlow v9 data from a few sources. Everything seems to be working fine so far, but every so often I start getting these messages in my streamfwd log. They last for several minutes and then go away, only to return several minutes later.

2017-09-12 15:48:49 WARN  [140371258496768] (NetflowManager/NetflowDecoder.cpp:1112) stream.NetflowReceiver - NetFlowDecoder::decodeFlow Unable to decode flow set data. No template with id 259 received for observation domain id 768 from device x.x.x.x . Dropping flow data set of size 56
2017-09-12 15:48:50 WARN  [140371258496768] (NetflowManager/NetflowDecoder.cpp:1112) stream.NetflowReceiver - NetFlowDecoder::decodeFlow Unable to decode flow set data. No template with id 259 received for observation domain id 768 from device x.x.x.x . Dropping flow data set of size 212
2017-09-12 15:48:51 WARN  [140371258496768] (NetflowManager/NetflowDecoder.cpp:1112) stream.NetflowReceiver - NetFlowDecoder::decodeFlow Unable to decode flow set data. No template with id 259 received for observation domain id 768 from device x.x.x.x . Dropping flow data set of size 160
2017-09-12 15:48:54 WARN  [140371258496768] (NetflowManager/NetflowDecoder.cpp:1112) stream.NetflowReceiver - NetFlowDecoder::decodeFlow Unable to decode flow set data. No template with id 259 received for observation domain id 768 from device x.x.x.x . Dropping flow data set of size 372
2017-09-12 15:48:57 WARN  [140371258496768] (NetflowManager/NetflowDecoder.cpp:1112) stream.NetflowReceiver - NetFlowDecoder::decodeFlow Unable to decode flow set data. No template with id 259 received for observation domain id 768 from device x.x.x.x . Dropping flow data set of size 108

What could be causing these messages to appear intermittently like that? I thought this could be due to a NetFlow template not being sent (Cisco devices are sending the NetFlow data), but I don't think that is the case, since it only happens intermittently.
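For context on the hypothesis above: a NetFlow v9 collector can only decode a data flow set after it has seen the template with the same ID from the same exporter and observation domain, which is what the warning text refers to. The following is a minimal Python sketch of that lookup, not Splunk Stream's actual implementation. `TemplateCache` and its methods are hypothetical names, and real templates carry field types as well as lengths.

```python
class TemplateCache:
    """Toy NetFlow v9 template cache (illustrative only, not streamfwd code)."""

    def __init__(self):
        # (exporter address, observation domain id, template id) -> field lengths
        self._templates = {}

    def add_template(self, exporter, domain_id, template_id, field_lengths):
        """Remember a template; field_lengths are byte lengths in record order."""
        lengths = tuple(field_lengths)
        if not lengths or any(length <= 0 for length in lengths):
            raise ValueError("template needs at least one positive field length")
        self._templates[(exporter, domain_id, template_id)] = lengths

    def decode(self, exporter, domain_id, template_id, payload):
        """Split payload into records, or return None if no template is known.

        Returning None corresponds to the "No template with id ..." case,
        where a collector has to drop the data flow set.
        """
        lengths = self._templates.get((exporter, domain_id, template_id))
        if lengths is None:
            return None
        record_len = sum(lengths)
        records = []
        offset = 0
        # Trailing bytes shorter than one record are treated as padding.
        while offset + record_len <= len(payload):
            fields = []
            for length in lengths:
                fields.append(payload[offset:offset + length])
                offset += length
            records.append(fields)
        return records
```

In this model, any data flow set that arrives while the cache has no entry for its key is lost, for example after the collector process restarts or if the template packet itself never arrives. Which of these applies here is not established by the sketch.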

In case it would help, my streamfwd.conf file contains the following lines:

[streamfwd]
logConfig = streamfwdlog.conf
port = 8889

netflowReceiver.0.ip = x.x.x.x
netflowReceiver.0.port = 9995
netflowReceiver.0.decoder = netflow

New Member

I am running into this issue on multiple setups. I set the IPFIX template interval to 1 second, and even then I see the error message "No template with id 677 received for observation domain id 262400".
Is there any solution?
Is Splunk timing out IPFIX templates?


Splunk Employee

Hello @lacrosse1991,

Have you tried setting the template broadcast interval on the Cisco device to a lower value (for example, every few seconds)? Are there any other errors or warnings in the log?


Explorer

Hi @vshcherbakov_splunk,

I did try changing the broadcast interval to 30 seconds, but the messages kept showing up, although this time they only appeared for a few seconds at a time. Unfortunately I could not find any other error messages in the logs.
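One way to compare the length of each warning burst against the configured template interval is to group the warnings by device, observation domain, and template ID. The sketch below is an assumption-laden helper, not part of Splunk Stream: `find_bursts` and `max_gap` are hypothetical names, and the regular expression assumes the exact warning format quoted earlier in this thread.

```python
import re
from datetime import datetime, timedelta

# Matches the streamfwd warning format shown in this thread (an assumption
# for other versions, whose wording may differ).
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) WARN .*"
    r"No template with id (?P<template>\d+) received for observation "
    r"domain id (?P<domain>\d+) from device (?P<device>\S+)"
)


def find_bursts(lines, max_gap=timedelta(seconds=60)):
    """Group warnings into bursts per (device, domain id, template id).

    Two warnings with the same key belong to one burst when they are at most
    max_gap apart. Returns a list of (key, start, end, count) tuples.
    """
    open_bursts = {}  # key -> [start, end, count]
    finished = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not a missing-template warning
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        key = (
            match.group("device"),
            int(match.group("domain")),
            int(match.group("template")),
        )
        burst = open_bursts.get(key)
        if burst is not None and ts - burst[1] <= max_gap:
            burst[1] = ts
            burst[2] += 1
        else:
            if burst is not None:
                finished.append((key, *burst))
            open_bursts[key] = [ts, ts, 1]
    for key, burst in open_bursts.items():
        finished.append((key, *burst))
    return finished
```

Passing an open streamfwd log file to `find_bursts` and printing `end - start` for each burst would show whether the bursts are roughly as long as the template interval, which is what one would expect if warnings stop as soon as the next template arrives.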


Explorer

Actually, I did find the following: https://imgur.com/a/Xcnow


Splunk Employee

Do the socket error messages get logged in streamfwd.log or in one of the splunk*.log files?


Explorer

That was found in splunkd.log.


Splunk Employee

Does the Stream TA get restarted by any chance?


Explorer

Not that I can find, unfortunately. Do you think it would be worth trying to reinstall Splunk? I installed using the tar.gz file, so I might try the .deb package instead.


Splunk Employee

It's hard to tell whether reinstalling Splunk would help, but I doubt that .deb vs. tar.gz by itself would make a difference. Is there any chance of network data loss between the NetFlow-generating device and the Stream instance? Also, what's the overall volume of NetFlow data you're ingesting?
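To check for data loss on the network path, one option is to look at the sequence number in each NetFlow v9 export packet header, which RFC 3954 defines as a counter of export packets per observation domain. Below is a minimal Python sketch under that assumption; `SequenceTracker` and `parse_v9_header` are hypothetical names, and the datagrams would have to come from a packet capture or a temporary test listener, since streamfwd already owns the receiver port.

```python
import struct

# NetFlow v9 packet header (RFC 3954): version, count, sysUpTime,
# UNIX seconds, sequence number, source ID. 20 bytes, network byte order.
V9_HEADER = struct.Struct("!HHIIII")


def parse_v9_header(datagram):
    """Return (sequence_number, source_id) from a NetFlow v9 datagram."""
    if len(datagram) < V9_HEADER.size:
        raise ValueError("datagram shorter than a NetFlow v9 header")
    version, _count, _uptime, _secs, sequence, source_id = V9_HEADER.unpack_from(datagram)
    if version != 9:
        raise ValueError("not a NetFlow v9 datagram")
    return sequence, source_id


class SequenceTracker:
    """Counts export packets that appear to be missing per (exporter, source ID)."""

    def __init__(self):
        self._last = {}  # (exporter, source_id) -> last sequence number seen

    def observe(self, exporter, datagram):
        """Return how many packets were skipped since the previous datagram.

        The first datagram for a key returns 0. Reordered packets or an
        exporter restart show up as a very large value because of the
        32-bit wraparound arithmetic.
        """
        sequence, source_id = parse_v9_header(datagram)
        key = (exporter, source_id)
        last = self._last.get(key)
        self._last[key] = sequence
        if last is None:
            return 0
        return (sequence - last - 1) % 2**32
```

A nonzero result around the times the warnings appear would point toward lost UDP packets, possibly including template packets. Consistently zero results would suggest looking at the collector side instead.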


Explorer

OK. I'll resort to that option last, then.

At the moment I'm testing on a lab device, so there may be something that's causing an interruption. I'll try pointing a production device at this box instead and see if anything behaves differently.

The lab device is sending out a very small amount of traffic, around 3000 NetFlow events per hour. The volume from the production device will more than likely be much higher.


Explorer

I tried two production devices, and unfortunately both of them exhibit the same behavior. You can see here https://imgur.com/a/TudnM where the events sharply cut off during the error message periods. I also noticed that the devices don't all start producing the messages at the same time: sometimes only one device does, while other times both do. The peak input for the devices is around 9000 events a minute.
