We're investigating how to best help customers who are using both Splunk and other operations management/monitoring tools in complex IT environments.
What we've been hearing is that customers prefer using Splunk for long-term reporting and correlation of IT data, even if the original data is gathered by other tools. What we're also hearing is that customers often have functional Tier-1 operations processes which are tightly linked with existing tools like Nagios or Remedy, and they want Splunk to work well with those existing processes.
So we're thinking that our top integration priorities should be to make it easier to:
The first supports data consolidation, reporting, and correlation scenarios. The second supports consolidation of alerting and Tier-1 responsibility in IT.
Do these sound like the right priorities, and are they in the right order? What else should we be thinking about when it comes to integration with operations management tools?
Is this question for customers? Speaking as someone internal, it sounds right to me.
There's a sort of timeliness and load distribution piece in here as well.
It's a question for everyone: Splunk customers as well as folks working for Splunk who understand customer needs.
I would definitely like to see that. Along with that, I would also like to see an integration with Xymon, a monitoring tool (http://hobbitmon.sourceforge.net/).
2) is more important to me, as just about everything we have can already feed into Splunk. However, we've had many requests to get data OUT of Splunk and into other systems. Sending the data over syslog isn't very helpful in our case; rather, we need to extract the data from Splunk and send it using other agents.
There is also another type of integration: calling Splunk from other apps and using the query results there (passing in a Splunk query, or parameters to a Splunk query), rather than just waiting for alerts to fire.
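For what it's worth, here is a rough sketch of what that kind of call could look like from another app, assuming the search export endpoint on the management port (`/services/search/jobs/export`) and token-based authentication are available on your version. The helper names, the hostname in the usage note, and the default time range are made up for illustration; the request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_export_request(base_url, search, token, earliest="-15m"):
    """Build (but do not send) a POST to the search export endpoint.

    base_url, token and earliest are placeholders for this sketch.
    """
    spl = search.strip()
    # Ad hoc searches sent over REST need a leading "search" command
    # unless they start with a generating command ("| ...").
    if not spl.startswith(("search ", "|")):
        spl = "search " + spl
    body = urllib.parse.urlencode({
        "search": spl,
        "earliest_time": earliest,
        "output_mode": "json",
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        headers={"Authorization": "Bearer " + token},
        method="POST",
    )


def parse_export_lines(lines):
    """Yield the "result" dict from each JSON line of an export response."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Lines without a "result" key (status-only rows) are skipped.
        if "result" in event:
            yield event["result"]
```

The calling app would pass the request to `urllib.request.urlopen` (with whatever TLS settings your deployment needs) and feed the decoded response lines to `parse_export_lines`, for example with a hypothetical base URL such as `https://splunk.example.com:8089`.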
Our primary interest would be in allowing ad hoc searching of event data that is generated by Entuity Eye of the Storm (EYE). This is a rich data source containing events relating to many parts of a monitored network. Within EYE, users are members of one or more user groups, and these user groups have permission settings that determine which groups of managed devices the users have access to. Users can therefore have overlapping access to the details of some devices, while the details of other devices may be visible to some users and denied to others. The integration with Splunk would ideally preserve the user permissions, so that one common user login would cause the appropriate access rights to be granted. This would avoid any need to manage access rights independently on both products.
We have a significant amount of data that lives in other systems (perf data, for example), and those systems happen to expose service interfaces to get at that data. Rather than importing that data into Splunk, I'd like to be able to somehow call those services from Splunk so I can correlate across the two. A good example would be correlating the number of hits/errors with Java heap utilization, where historical Java heap utilization is a service call away in another system.
I would love to be able to send alerts out via SNMP and syslog, and to forward logs via syslog or other means, to help integrate between various data sources.
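As an illustration of the syslog half of this, an alert script could do something like the sketch below. It is not a Splunk feature, just a minimal example that sends a BSD-syslog-style UDP packet carrying only a priority and a tag; the function names, the `splunk-alert` tag, and the default facility/severity (local0, warning) are assumptions, and the receiver is expected to fill in the timestamp and host itself.

```python
import socket


def format_syslog(message, facility=16, severity=4, tag="splunk-alert"):
    """Return a minimal syslog packet: <PRI>tag: message.

    PRI is facility * 8 + severity; 16 is local0 and 4 is warning.
    """
    pri = facility * 8 + severity
    return f"<{pri}>{tag}: {message}".encode("utf-8")


def send_syslog(message, host, port=514):
    """Send one alert message to a syslog receiver over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_syslog(message), (host, port))
```

UDP keeps the script trivial but gives no delivery guarantee, so a receiver that matters would probably want TCP or an agent in front of it.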
Would you like to include network information in your log analysis? If so, have a look at http://splunk-base.splunk.com/apps/43328/netflow-based-network-monitoring-beta. It's only a sample app. There's much more to NetFlow than just watching how much http data is passing through a router 🙂