Splunk App for Stream: Is there a way to diagnose if a network interface is working as expected or pushing its limits and dropping packets?

gesman
Communicator

I have two identical Linux appliances, with "capture software A" installed on appliance 1 and Splunk Stream installed on appliance 2.
Both are connected in the same way to datacenter switch TAP ports to monitor and capture HTTP traffic.

What I found is that "capture software A" captures packets 1,2,3,4,5
while Splunk Stream captures packets 1,2,4,5.

I am not ruling out that packet "3" never arrived at appliance 2, or that it was dropped by the ETH interface before reaching the Splunk Stream layer.
Is there a way to diagnose whether the network interface is working as expected or is pushing its limits and dropping packets?

It would be great to have some way to check the reliability and completeness of the capture process, as we're dealing with an online banking portal here.

Gleb

1 Solution

mdickey_splunk
Splunk Employee

Approximately how much traffic (bps) are you capturing, and do you see any WARN or ERROR messages in index=_internal sourcetype=stream:log, such as "Max packet queue size exceeded"?

Stream tracks an internal metric called DroppedPackets that it records to index=_internal sourcetype=stream:stats. This represents the number of packets received by the network interface but not processed. You can get a report on this using the following search:

index=_internal sourcetype=stream:stats | spath Output=DroppedPackets path=sniffer{}.captures{}.droppedPackets | eventstats sum(DroppedPackets) by _cd | rename sum(DroppedPackets) as SumDroppedPackets | streamstats current=t global=f window=2 earliest(SumDroppedPackets) as prev latest(SumDroppedPackets) as curr by host | eval delta=curr-prev | eval absdelta=case(delta<=0, 0, delta>0, delta) | timechart sum(absdelta) as delta by host

If this seems high, try upgrading to 6.1.1 as it fixes most issues related to DroppedPackets.
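
For readers unfamiliar with the streamstats/eval part of the search above, here is a minimal Python sketch of the same delta logic. It assumes droppedPackets is reported as a running total (which is what the prev/curr comparison suggests); the function name and sample values are made up for illustration and are not part of Stream.

```python
from typing import Iterable, List


def drop_deltas(samples: Iterable[int]) -> List[int]:
    """Per-interval increases of a cumulative counter.

    A negative difference (for example after a restart resets the
    counter) is clamped to 0, like the case() expression in the search.
    """
    deltas: List[int] = []
    prev = None
    for curr in samples:
        if prev is not None:
            delta = curr - prev
            deltas.append(delta if delta > 0 else 0)
        prev = curr
    return deltas


# Hypothetical running totals; the drop from 150 to 20 mimics a reset.
print(drop_deltas([100, 150, 150, 20, 90]))  # prints [50, 0, 0, 70]
```

Unlike the search, this sketch emits nothing for the first sample, since there is no earlier value to compare it with.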

gesman
Communicator

Thank you so much!
This was a great help.
I took a brief look and see that dropped packets are in the hundreds of thousands (out of a total of approximately 1.5M events per day).
I will try an upgrade and see if there is an improvement.

I also noticed that a few times my real-time alerts were not triggered, even though the data was indexed. When I run the alert query manually, it finds alertable results for which no alerts were triggered before. I'm not sure if this is related, but it seems some sort of congestion is going on.

mdickey_splunk
Splunk Employee

It could be. If your indexers are having problems, it can create a backlog that blocks Stream, which in turn would result in lots of missing packets. You really should not be seeing ANY DroppedPackets at all.
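
To help separate drops inside Stream from drops at the interface itself, the kernel's per-interface receive counters on the Linux capture host can be checked as well. The following is a rough sketch, not something Stream provides: it parses the standard /proc/net/dev layout, the function names are hypothetical, and the SAMPLE text contains made-up numbers. Counters kept by the NIC driver or the capture library are not covered here.

```python
from typing import Dict

# Made-up sample in the /proc/net/dev layout (two header lines, then one
# line per interface). Real values come from the capture host.
SAMPLE = """Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 1000 10 0 0 0 0 0 0 1000 10 0 0 0 0 0 0
  eth0: 5000 50 0 42 0 0 0 0 3000 30 0 0 0 0 0 0
"""


def parse_rx_drops(proc_net_dev_text: str) -> Dict[str, int]:
    """Return {interface: receive 'drop' counter} from /proc/net/dev text."""
    drops: Dict[str, int] = {}
    for line in proc_net_dev_text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) >= 4:
            # Receive columns are: bytes, packets, errs, drop, ...
            drops[name.strip()] = int(fields[3])
    return drops


def read_rx_drops(path: str = "/proc/net/dev") -> Dict[str, int]:
    """Read the live counters on a Linux host."""
    with open(path) as f:
        return parse_rx_drops(f.read())


print(parse_rx_drops(SAMPLE))  # prints {'lo': 0, 'eth0': 42}
```

Calling read_rx_drops() twice, some minutes apart, and comparing the values for the capture interface shows whether the kernel counted receive drops during that window.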
