Does anyone have a best practice for choosing the polling interval for each type of machine data? I'm still puzzled about how to determine the polling rate for each one. Make it too frequent and it generates noise; make it too slow and it reduces accuracy. Any help on what I need to consider?
Thank you so much
It depends on what you are going to do with the metrics and on your use cases. If it's about scheduling alerts, the minimum preferable interval is 1-2 minutes, and your polling interval should then be adjusted to match.
We poll every 5 minutes for most system metrics, and once a day for things like SSL certificates and licenses, which don't change within a few minutes.
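As a rough sketch, the tiered intervals described above can be encoded as a simple schedule. The metric names and interval values here are illustrative assumptions based on this thread, not a standard:

```python
# Illustrative polling schedule: fast-changing system metrics vs.
# slow-changing inventory data. Intervals are in seconds.
POLL_INTERVALS = {
    "cpu": 300,                 # 5 minutes
    "memory": 300,              # 5 minutes
    "disk_usage": 300,          # 5 minutes
    "ssl_certificates": 86400,  # 1 day: expiry dates don't change in minutes
    "licenses": 86400,          # 1 day
}

def next_poll_time(metric, last_polled_epoch):
    """Return the epoch time at which `metric` should next be polled."""
    return last_polled_epoch + POLL_INTERVALS[metric]
```

The idea is simply to group metrics by how fast they can plausibly change, rather than polling everything at one rate.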
See also:
https://answers.splunk.com/answers/5136/time-interval-for-searches.html
https://wiki.splunk.com/Community:Best_practices_for_Splunk_alerting
Thanks for your suggestions. But how can you say that you need data every 5 minutes? Why not every 4, or 6? Do you weigh data fidelity against noise? I think what I need here is a justification for why I choose one interval and not another.
It completely depends on your requirements. For example, the polling interval for my disk space is 5 minutes, because I know there is no process that could write 100 GB of data in 5 minutes. But that might not be your case. So it changes depending on what you collect, how you use it, and for what purpose 🙂
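The disk-space reasoning above can be turned into a back-of-the-envelope formula: pick the interval so that the worst-case change between two polls stays well below the headroom you alert on. A hedged sketch, where the write rate, headroom, and safety factor are all assumed figures you would replace with your own:

```python
def max_safe_interval_minutes(headroom_gb, worst_case_rate_gb_per_min,
                              safety_factor=2.0):
    """Longest polling interval such that, even at the worst-case write
    rate, free space cannot drop by more than headroom/safety_factor
    between two consecutive polls."""
    return headroom_gb / (worst_case_rate_gb_per_min * safety_factor)

# Example: alert when free space falls below 100 GB, and assume no
# process writes faster than about 10 GB/min.
interval = max_safe_interval_minutes(headroom_gb=100,
                                     worst_case_rate_gb_per_min=10)
# -> 5.0 minutes, matching the "every 5 minutes" choice above
```

This is only a sanity check, not a framework: the hard part is estimating the worst-case rate of change for each metric, which is where domain knowledge (the "no process writes 100 GB in 5 minutes" claim) comes in.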
I appreciate your feedback and suggestions, but don't you have any framework for determining the polling rate for each type of machine data you sample? Thanks.
Isn't it considered a "guess" when you just assign x hours to some machine data because you know it doesn't change every x minutes?
Sorry, but no, we don't have any such framework.
Are you referring to system metrics like CPU, RAM, disk, etc.?
Yes, every piece of machine data that needs to be monitored. How do you determine the best polling rate for each? Do you have a computation or something?