One of the ways we suggest customers address Stream and NetFlow volume concerns is by using the "Aggregation" feature. With Aggregation, instead of sampling or prefiltering, you summarize the incoming data using a user-defined key and aggregate fields, computed over a custom aggregation interval.
For example, you could define the triple (sip, dip, dport) as your aggregation key, along with three aggregation functions to compute for each key. If you specified an aggregation interval of 600 seconds (for example), then every 10 minutes you'd get a list of the unique (sip, dip, dport) triples and the three aggregated values computed for each of them. Since this happens before the data is indexed, the effect on your Splunk license is greatly reduced.
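To make the idea concrete, here is a minimal Python sketch of what one aggregation interval does conceptually: flow records sharing the same (sip, dip, dport) key collapse into a single summary row. The specific aggregate fields shown (total bytes, total packets, flow count) are illustrative assumptions, not Stream's actual configuration syntax.

```python
from collections import defaultdict

# Hypothetical flow records collected during one aggregation interval.
flows = [
    {"sip": "10.0.0.1", "dip": "10.0.0.9", "dport": 443, "bytes": 1200, "packets": 4},
    {"sip": "10.0.0.1", "dip": "10.0.0.9", "dport": 443, "bytes": 800,  "packets": 2},
    {"sip": "10.0.0.2", "dip": "10.0.0.9", "dport": 53,  "bytes": 120,  "packets": 1},
]

def aggregate(records):
    """Summarize records by the (sip, dip, dport) key, emitting one row
    per unique key with three illustrative aggregate fields."""
    buckets = defaultdict(
        lambda: {"sum_bytes": 0, "sum_packets": 0, "flow_count": 0}
    )
    for r in records:
        b = buckets[(r["sip"], r["dip"], r["dport"])]
        b["sum_bytes"] += r["bytes"]      # total bytes per key
        b["sum_packets"] += r["packets"]  # total packets per key
        b["flow_count"] += 1              # number of flows per key
    return dict(buckets)

# Three input records collapse to two summary rows, one per unique key.
summary = aggregate(flows)
```

The point of the sketch is the volume reduction: however many raw flows arrive during the interval, only one record per unique key is emitted for indexing.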
This is a great alternative to prefiltering out a lot of potentially useful data: you retain much of the resolution of the original flow records without anywhere near the volume.
David J Cavuto, CISSP
Principal Product Manager, Splunk Stream™