I have been through the blogs below on HTTP event collector, but I'm looking for detailed explanation/use cases on using the HTTP event collector.
As I understand it, with HEC we send the data directly to an indexer without a universal forwarder. Is that correct?
In what scenarios would this be helpful?
Any explanation would be appreciated. Thanks.
Hello @mcnamara,
You are right. More details from
http://blogs.splunk.com/2015/10/06/http-event-collector-your-direct-event-pipe-to-splunk-6-3/
HTTP Event Collector (EC) is a new, robust, token-based JSON API for sending events to Splunk from anywhere without requiring a forwarder. It is designed for performance and scale. Using a load balancer in front, it can be deployed to handle millions of events per second. It is highly available and it is secure. It is easy to configure, easy to use. A few other cool tidbits: it supports gzip compression, batching, HTTP keep-alive and HTTP/HTTPS.
If you are a developer looking to get visibility into your applications within Splunk, looking to capture events from external systems and devices (IoT), or you offer a product that you’d like to integrate with Splunk, HTTP Event Collector is the way to go.
Picking up one example from @DamienDallimore, you can enable HEC in the Java instrumentation app https://splunkbase.splunk.com/app/1716/
which is an instrumentation agent for tracing code-level metrics via bytecode injection, JMX attributes/operations/notifications and decoded HPROF records, and for streaming these events directly into Splunk. The JVM might be running on any machine or container, and you can collect the data directly from the source without the need for a forwarder.
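To make the "token-based JSON API" with batching and gzip a bit more concrete, here is a minimal Python sketch of what a client request could look like. It assumes the standard `/services/collector/event` endpoint and the `Authorization: Splunk <token>` header; the function names, the host name and the token in the usage note are made up for illustration.

```python
import gzip
import json
import urllib.request


def build_hec_request(base_url, token, events, compress=True):
    """Build (but do not send) a POST request carrying a batch of events.

    base_url and token are placeholders for your own HEC host and token.
    """
    # A HEC batch is a series of JSON objects written back to back,
    # not a JSON array.
    body = "".join(json.dumps({"event": e}) for e in events).encode("utf-8")
    headers = {
        "Authorization": "Splunk " + token,
        "Content-Type": "application/json",
    }
    if compress:
        # gzip the body and tell the server about it
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return urllib.request.Request(
        base_url.rstrip("/") + "/services/collector/event",
        data=body,
        headers=headers,
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and return (HTTP status, response text)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Usage would be something like `send(build_hec_request("https://hec.example.com:8088", "<your-token>", ["app started", {"user": "bob", "action": "login"}]))`, where 8088 is the default HEC port. Building and sending are kept separate so you can inspect the request before anything goes over the wire.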
Hi @mcnamara,
If it can help you, I have pushed a very simple python3 script template to test sending events to Splunk HTTP EC here: https://github.com/Julien-Bernard/scripts/blob/master/splunk_http_event_collector_template.py.
Hello @mcnamara, adding to what was said above: HEC can be enabled on a heavy forwarder (HWF) or an indexer. We don't support it on the universal forwarder (UF) currently.
You can read more on our distributed deployment options here.
As to why we don't support UF, the main reason is that the UF was designed to piggyback on a machine where other processes are running and to listen to those processes. HEC, in contrast, is designed for high volume and high scale. Having a dedicated tier of HWFs running HEC behind a load balancer such as nginx (see the doc) makes sense for these kinds of scenarios.
As to indexers vs. HWFs, there are different tradeoffs to consider. Having HEC running on a single indexer is the out-of-the-box single-instance configuration. Running on multiple indexers is simpler to deploy in some cases, as you don't need an additional special tier. However, for maximum scale and throughput we'd recommend HWFs.
Hope this helps.
@gblock...Thanks for the reply...
Let's say I have a mobile or client application. Can I ask the developer to send the data to the HWF in JSON format?
client/mobile application---------------->HWF-------------------------->Splunk Indexer
1. Do I need to enable HEC on both the HWF and the indexer?
2. In the above workflow I want my HWF to not index locally, just route the data to the Splunk indexer. Does setting indexAndForward = false in outputs.conf on the HWF make it forward only?
3. After the data has been indexed by the Splunk indexer, does Splunk auto-extract key-value pairs even from JSON format?
1. No, you don't have to enable HEC on the indexer, only on the forwarder.
2. Yes, by default it will just forward and not index locally.
3. It does automatic key-value extraction (autokv) at search time but not at index time. We don't support JSON indexed extraction.
It does support regex extractions and transforms.
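For the first two answers, the heavy forwarder configuration might look roughly like the fragment below. Treat it as a sketch under assumptions, not a tested deployment: the stanza names `mobile_app` and `primary_indexers`, the indexer host name and the token placeholder are hypothetical, and 8088 and 9997 are just the commonly used default ports.

```
# inputs.conf on the HWF: enable HEC and define one token (names are placeholders)
[http]
disabled = 0
port = 8088

[http://mobile_app]
disabled = 0
token = <your-token-guid>

# outputs.conf on the HWF: forward everything to the indexer tier
[tcpout]
defaultGroup = primary_indexers
# false is already the default; shown only to make the intent explicit
indexAndForward = false

[tcpout:primary_indexers]
server = indexer1.example.com:9997
```

Nothing HEC-specific is configured on the indexer here; it only needs to accept forwarded data on its receiving port.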
Yes, a mobile device can send data directly to the HWF. This is what we demoed in the keynote at .conf. We also have an app on our GitHub that shows this:
If you are sending mobile data from a browser, you will need to enable CORS in HEC. If the payload is sent over HTTPS, you will need a valid HTTPS certificate, because some browsers require one.
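As an illustration of what the mobile or client developer would actually send, here is a small Python sketch that wraps an application payload in the HEC event envelope. The `time`, `host`, `source`, `sourcetype`, `index` and `event` keys are the HEC metadata fields; the helper name `make_hec_event`, the sourcetype `mobile:app` and the sample payload are invented for this example.

```python
import json
import time


def make_hec_event(payload, sourcetype=None, source=None, host=None,
                   index=None, timestamp=None):
    """Wrap an application payload in a HEC JSON envelope (returned as a string)."""
    envelope = {
        # epoch seconds; defaults to "now" when the caller gives no timestamp
        "time": time.time() if timestamp is None else timestamp,
        # the payload can be a string or a nested JSON object
        "event": payload,
    }
    # Only include metadata the caller actually set, so that the defaults
    # configured for the token on the Splunk side apply otherwise.
    optional = (("sourcetype", sourcetype), ("source", source),
                ("host", host), ("index", index))
    for key, value in optional:
        if value is not None:
            envelope[key] = value
    return json.dumps(envelope)


if __name__ == "__main__":
    # Hypothetical event from a mobile client
    print(make_hec_event({"action": "login", "user_id": 42},
                         sourcetype="mobile:app", host="device-1"))
```

Since the event body is itself JSON, fields such as `action` and `user_id` are the kind of key-value pairs the search-time extraction mentioned in answer 3 would be working with.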
@renjith, @gblock... Thanks for the explanation, I hope to implement this soon. 🙂
No problem. Good luck!
Thanks renjith..
It says that with HEC the data is transferred over HTTP. What would be the transfer protocol when we use UF----->Splunk?
If you are using a HWF with HEC, then HTTP is the protocol that the HEC endpoint listens on. The HWF then forwards the data on to Splunk using our proprietary S2S protocol, which runs over TCP/IP.
If I'm not wrong, Splunk uses its own proprietary "Splunk protocol" on top of TCP.