Monitoring Splunk

How to find events dropped

hectorvp
Communicator

Is there any way to find out how many events were dropped by a UF in a day?

I need a daily report showing how many events were dropped by the UF.

Can I get the number of events dropped?

Or the volume, e.g. xyz MB of events dropped?

 

 

1 Solution

gcusello
SplunkTrust

Hi @hectorvp,

Could you describe more precisely what you mean by dropped events?

Splunk takes in all the events you configure it to take and doesn't drop events unless you tell it to.

In other words, if you are losing events, you have to understand why you're losing them:

Are they in duplicated files? Splunk doesn't index the same file twice.

Is the parsing wrong? Check the timestamp parsing: maybe your events were indexed with the wrong timestamp (e.g. 10/02/2020 could be read as 10 February but also as 2 October).
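
To make the timestamp point concrete, here is a minimal Python sketch (not Splunk code) showing how the same string produces two very different dates depending on the assumed format. An event stamped with the wrong reading looks "missing" when you search the time range where you expect it. In Splunk itself, the usual remedy is to set an explicit timestamp format for the sourcetype (the TIME_FORMAT setting in props.conf) rather than relying on automatic detection.

```python
from datetime import datetime

raw = "10/02/2020"

# Day-first reading: 10 February 2020
day_first = datetime.strptime(raw, "%d/%m/%Y")

# Month-first reading: 2 October 2020
month_first = datetime.strptime(raw, "%m/%d/%Y")

# Distance between the two interpretations of the same string
gap_days = (month_first - day_first).days

print(day_first.date())    # 2020-02-10
print(month_first.date())  # 2020-10-02
print(gap_days)            # 235
```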

Splunk doesn't lose events even if the Indexers are down, because Universal Forwarders cache events.

There's only one way to lose events: if you receive syslog and you aren't using an HA setup, i.e. two Heavy Forwarders behind a load balancer.

In any case, if you really did miss some events, the only way to find out is to compare what you have indexed with the source files.
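
The comparison could be sketched roughly like this in Python. It assumes one event per line, that you have collected line counts per file on the host, and that you have exported per-source event counts from a search (for example something like `| stats count by source`). The function name, the paths and all the numbers are hypothetical.

```python
def missing_per_source(source_counts, indexed_counts):
    """Return {source: events present on disk but not found in the index}."""
    missing = {}
    for source, expected in source_counts.items():
        indexed = indexed_counts.get(source, 0)
        if indexed < expected:
            missing[source] = expected - indexed
    return missing


# Hypothetical line counts taken on the host
source_counts = {"/var/log/app.log": 1200, "/var/log/auth.log": 300}

# Hypothetical event counts exported from a search, keyed by source
indexed_counts = {"/var/log/app.log": 1200, "/var/log/auth.log": 280}

result = missing_per_source(source_counts, indexed_counts)
print(result)  # {'/var/log/auth.log': 20}
```

This only works while the source files still exist, and multi-line events would need a different way of counting on the file side.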

Ciao.

Giuseppe

inventsekar
Super Champion

Hi @hectorvp,

Q - Is there any way to find how many events were dropped by UF in a day?

A - No. If you analyse the situation a little more, you can see why:

1. The UF sends the data to the indexer. Let's say the indexer is down. The data is not dropped; it is still waiting to be read at the UF. When the indexer comes up again, the UF resumes reading from where it left off.

2. Persistent queues are one option to look at.

By default, forwarders and indexers have an in-memory input queue of 500KB. You can configure a persistent queue and increase its size to reduce the risk of dropped data.

https://docs.splunk.com/Documentation/Splunk/latest/Data/Usepersistentqueues

 

3. Indexer acknowledgement is another good solution. It adds a little more load on the indexer, but it is worth it. With indexer acknowledgement enabled, the indexer and the UF exchange an extra layer of "handshakes", so the UF always knows whether the indexer has received the data.

Documentation on the indexer acknowledgement feature:

https://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Protectagainstlossofin-flightdata
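
To illustrate the idea behind the "handshake", here is a toy Python model of an acknowledgement scheme: the sender keeps each block in a wait queue until the receiver confirms it, and resends anything unconfirmed. This is only a sketch of the general technique, not Splunk's implementation; the class and method names are made up for the example. In Splunk the feature is switched on with the useACK setting in outputs.conf, as described in the documentation linked above.

```python
from collections import OrderedDict


class ToyForwarder:
    """Keeps every sent block until the receiver acknowledges it."""

    def __init__(self):
        self.wait_queue = OrderedDict()  # block_id -> payload awaiting ack
        self.next_id = 0

    def send(self, payload, indexer):
        block_id = self.next_id
        self.next_id += 1
        self.wait_queue[block_id] = payload  # keep a copy until acked
        indexer.deliver(block_id, payload)
        return block_id

    def on_ack(self, block_id):
        self.wait_queue.pop(block_id, None)  # safe to forget the block now

    def resend_unacked(self, indexer):
        # Iterate over a copy because on_ack mutates the wait queue
        for block_id, payload in list(self.wait_queue.items()):
            indexer.deliver(block_id, payload)


class ToyIndexer:
    """Stores blocks and acknowledges them, but only while it is up."""

    def __init__(self, forwarder):
        self.forwarder = forwarder
        self.up = True
        self.stored = {}

    def deliver(self, block_id, payload):
        if not self.up:
            return  # block lost in flight, no ack goes back
        self.stored[block_id] = payload  # keyed by id, so resends do not duplicate
        self.forwarder.on_ack(block_id)


fwd = ToyForwarder()
idx = ToyIndexer(fwd)

fwd.send("event A", idx)  # delivered and acknowledged
idx.up = False
fwd.send("event B", idx)  # lost in flight, stays in the wait queue
unacked_while_down = len(fwd.wait_queue)

idx.up = True
fwd.resend_unacked(idx)  # event B is delivered on the retry

print(unacked_while_down)   # 1
print(idx.stored)           # {0: 'event A', 1: 'event B'}
print(len(fwd.wait_queue))  # 0
```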

 

hectorvp
Communicator

Hi @gcusello ,

I need to know whether, in some unwanted situation, the UF wasn't able to deliver events to the indexers.

For example, if the UF crashes, the data in the in-memory queues (i.e. the parsing queue and the output queue) is lost.

I need to find out whether there is any way to get the number of events dropped.

I saw somewhere that there is an internal log message saying "splunkd had begun drop event" (maybe this message appears when syslog is used).

I thought we might be able to calculate something from metrics.log, but I haven't found anything concrete yet.
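
As a starting point for the metrics.log idea, here is a rough Python sketch that counts how often each queue was reported as blocked. The sample lines are hypothetical and only modelled on the key=value layout of queue entries in metrics.log, so check the field names against your own file. Note that a blocked queue indicates back pressure, not necessarily data loss, so this gives a warning signal rather than a dropped-event count.

```python
import re
from collections import Counter

# key=value pairs separated by commas, e.g. "group=queue, name=parsingqueue"
KV = re.compile(r"(\w+)=([^,\s]+)")


def blocked_queue_counts(lines):
    """Count, per queue name, the metrics lines that report blocked=true."""
    counts = Counter()
    for line in lines:
        fields = dict(KV.findall(line))
        if fields.get("group") == "queue" and fields.get("blocked") == "true":
            counts[fields.get("name", "unknown")] += 1
    return counts


# Hypothetical sample lines (assumed format, not copied from a real system)
sample = [
    "01-27-2021 10:00:00.123 +0000 INFO  Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=512, current_size_kb=511",
    "01-27-2021 10:00:30.123 +0000 INFO  Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=512, current_size_kb=511",
    "01-27-2021 10:01:00.123 +0000 INFO  Metrics - group=queue, name=parsingqueue, max_size_kb=512, current_size_kb=0",
    "01-27-2021 10:01:00.123 +0000 INFO  Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499",
    "01-27-2021 10:01:00.123 +0000 INFO  Metrics - group=thruput, name=index_thruput, kbps=1.5",
]

blocked = blocked_queue_counts(sample)
print(dict(blocked))  # {'parsingqueue': 2, 'indexqueue': 1}
```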

gcusello
SplunkTrust

Hi @hectorvp,

If a UF crashes, you shouldn't lose events (unless you lose the hard disk!).

If the events are written to files, the UF picks them up when it restarts, and you don't lose them unless your retention period is shorter than the time needed to restart the UF (e.g. you keep wineventlogs for 24 hours and need more time than that to restart the UF!).

If you do lose them, I don't think it's possible to know how many events you lost: if they are still in the files, you can index them after the restart; if they are no longer there, you have nothing to compare against to get the number.

 Ciao.

Giuseppe
