We have an index that receives around 2 million events/hour, and it seems a sizable number of events are not making it from the manager to our Splunk instance. At the very least we are talking about 60,000 events in a 24-hour period. This would seem to be beyond the normal expected loss for connectionless UDP. Is it possible Splunk is being inundated with so many events that some are being discarded?
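For scale, the figures above can be turned into a loss rate. This is only a rough sketch: it assumes the 2 million events/hour rate holds steadily for the full 24 hours, and the function and constant names are made up for illustration.

```python
# Figures quoted above; the steady hourly rate is an assumption.
EVENTS_PER_HOUR = 2_000_000
LOST_PER_DAY = 60_000


def loss_rate(events_per_hour, lost_per_day):
    """Fraction of a day's events that went missing."""
    events_per_day = events_per_hour * 24
    return lost_per_day / events_per_day


rate = loss_rate(EVENTS_PER_HOUR, LOST_PER_DAY)
print(f"loss rate: {rate:.4%}")  # prints "loss rate: 0.1250%"
```

That is roughly 1 event in 800, out of about 48 million events per day.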
You can check how Splunk is performing by looking at the throughput for that particular input. UDP is not recommended as the network protocol if you are concerned about data loss, because it offers no delivery guarantee. The following wiki topic covers tuning recommendations and some troubleshooting tips:
http://www.splunk.com/wiki/Community:UDPInputs
The capacity of a Splunk instance is mostly determined by the hardware. Our reference architecture (for handling 100 GB/day) is capable of handling peak throughput in excess of 3 MB/sec. I have seen up to 10 MB/sec in some cases.
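To relate those numbers to the event rate in the question, here is a back-of-the-envelope sketch. It assumes 1 GB = 1024 MB and an evenly spread event rate (real traffic is bursty, so peaks will be higher), and the helper names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def avg_mb_per_sec(gb_per_day):
    """Average MB/sec implied by a daily volume (assumes 1 GB = 1024 MB)."""
    return gb_per_day * 1024 / SECONDS_PER_DAY


def events_per_sec(events_per_hour):
    """Average events/sec, assuming an evenly spread rate."""
    return events_per_hour / 3600


def bytes_per_event_at(mb_per_sec, ev_per_sec):
    """Average event size (bytes) at which ev_per_sec fills mb_per_sec."""
    return mb_per_sec * 1024 * 1024 / ev_per_sec


avg = avg_mb_per_sec(100)                 # reference architecture volume
eps = events_per_sec(2_000_000)           # rate from the question
size = bytes_per_event_at(3, eps)         # 3 MB/sec peak figure

print(f"100 GB/day averages {avg:.2f} MB/sec")
print(f"2M events/hour averages {eps:.0f} events/sec")
print(f"3 MB/sec is reached at about {size:.0f} bytes/event")
```

Under these assumptions, 100 GB/day averages a little under 1.2 MB/sec, and 2 million events/hour would only reach the 3 MB/sec figure if events averaged several kilobytes each. Comparing this against the actual average event size gives a first indication of whether volume alone is a plausible cause.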