@gblock, thanks very much for weighing in on this question, much appreciated. I'd hoped to catch your attention.
I'm asking this question primarily on behalf of some developer colleagues who will soon be turning their attention to sending events to Splunk over an IP network. In particular, I want to present them with information to help them decide whether to use HEC or a TCP input. As mentioned in my question, I've already done some research, and have hands-on experience using both HEC and TCP inputs (albeit currently only on a small scale, on a single Splunk instance).
To recap: my question is "Why would I use HEC when I can use TCP?". That is, as further clarified in the details of the question, why would I choose to use HEC in situations where I can use either HEC or TCP?
I'm interpreting your answer in the context of that question.
Point by point:
"There are many clients where TCP is not a viable option, such as sending from the browser."
Yes, fair point. However - I sincerely don't mean to be adversarial or otherwise annoy you; I'm grateful for your time, and hope for more advice from you on your subsequent points - this point is not relevant to the specific context of this question, where TCP is a viable option.
Incidentally, and more or less just for fun, this morning I played around sending events from a web browser (Chrome) to a Splunk TCP input. Yes, really (and, yes, I do have better things to do ;-):
xhr = new XMLHttpRequest()
xhr.open("POST", "http://localhost:6067")
xhr.send("{\"my_field\": \"some_value\"}")
with the following stanza in props.conf:
[source::tcp:6067]
KV_MODE = json
LINE_BREAKER = ((^[^{][^\n]*\r\n)*)\{\"[^}]+\}
SHOULD_LINEMERGE = false
The LINE_BREAKER is intended to ditch the multiline HTTP request header.
It kinda works: for each xhr.send, I get two events in Splunk:
The event I want, {"my_field": "some_value"}, with my_field correctly presented as a field.
An unwanted event, with a time stamp 10 seconds earlier (!), consisting only of the multiline HTTP header (which I thought I told LINE_BREAKER to discard!)
I spent some time Googling about automatic HTTP request retries, and whether I can set an Ajax request to use HTTP 1.0 instead of 1.1, but gave up. Maybe I just specified an inappropriate regex?
Interesting, but academic, thanks to HEC. Moving on.
"Scale. HEC is stateless and designed to easily scale out across a pool of instances behind a LB."
Again, fair point. But, again, the point of this question is to decide between using HEC and a Splunk TCP input.
A Splunk TCP input is also stateless. Right?
And a Splunk TCP input easily scales out across a pool of instances behind a load balancer (LB), too. Or am I missing something here?
The Splunk dev topic "High volume HTTP Event Collector data collection using distributed deployment" describes using a network traffic load balancer (such as NGINX) in front of several Splunk Enterprise indexers.
Is there any reason why I can't do the same thing - use a TCP LB, such as NGINX or HAProxy - for Splunk TCP traffic?
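For concreteness, the kind of TCP load balancer setup I have in mind looks something like the following. This is an untested sketch using the NGINX stream module; the indexer host names are hypothetical, and port 6067 is simply the one from my experiment above. It is not taken from the Splunk docs.

```nginx
# Sketch only: plain TCP load balancing across two Splunk TCP inputs.
# Host names below are hypothetical placeholders.
stream {
    upstream splunk_tcp_inputs {
        server indexer1.example.com:6067;
        server indexer2.example.com:6067;
    }

    server {
        listen 6067;
        proxy_pass splunk_tcp_inputs;
    }
}
```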
"Performance. We've heavily optimized HEC to handle 100K events or more per instance."
How does that compare with the performance of a Splunk TCP input?
HTTP involves processing that TCP does not, such as parsing an HTTP request header and returning a response with a header (and, in the case of HEC, a JSON-format body).
This is one reason for my original question: if I don't want or need the processing overhead of HTTP versus TCP, why use HEC?
Outside of this processing that is specific to HTTP - and so, an overhead, when compared to TCP - I would have thought that the remainder of the event processing would be common to both HEC and Splunk TCP inputs. Or could be, if it isn't: that's one reason why I recently asked the question "Can I use the HTTP Event Collector JSON event protocol for TCP inputs?".
As you mention in the next point, HEC has rich support for JSON out of the box. Does that "protocol" - for example, specifying the time in the metadata as a Unix Epoch value - improve the performance of HEC versus a Splunk TCP input? If so, why not offer that same JSON structure for TCP inputs? (Or are you deliberately deprecating TCP inputs in favor of HEC?)
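To make the comparison concrete, here is a rough sketch (Node.js) of what goes over the wire for the same event in each case: a bare JSON line for a TCP input, versus an HTTP request wrapping the HEC JSON envelope with the time given as a Unix epoch value. The token, host, and helper names are hypothetical, and the header set is a bare minimum; a real HTTP client would send more. It only compares bytes sent, not how Splunk processes them.

```javascript
// Hypothetical placeholder token, for illustration only.
const HEC_TOKEN = "00000000-0000-0000-0000-000000000000";

// What I would write to a TCP input: the event as one JSON line.
function buildTcpPayload(event) {
  return JSON.stringify(event) + "\n";
}

// The HEC JSON envelope: metadata (here just time) plus the event itself.
function buildHecBody(event, epochSeconds) {
  return JSON.stringify({ time: epochSeconds, event });
}

// The raw text of a minimal HTTP request carrying that envelope to HEC.
function buildHecRequest(event, epochSeconds, host = "localhost", port = 8088) {
  const body = buildHecBody(event, epochSeconds);
  const headers = [
    "POST /services/collector/event HTTP/1.1",
    `Host: ${host}:${port}`,
    `Authorization: Splunk ${HEC_TOKEN}`,
    "Content-Type: application/json",
    `Content-Length: ${Buffer.byteLength(body)}`,
  ];
  return headers.join("\r\n") + "\r\n\r\n" + body;
}

const event = { my_field: "some_value" };
const tcpPayload = buildTcpPayload(event);
const hecBody = buildHecBody(event, 1433188255);
const hecRequest = buildHecRequest(event, 1433188255);

console.log("TCP bytes:", Buffer.byteLength(tcpPayload));
console.log("HEC bytes:", Buffer.byteLength(hecRequest));
```

The HEC request is necessarily larger per event, before counting the response that comes back. Whether the explicit time in the envelope saves enough work on the Splunk side to offset that is exactly what I'm asking.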
Aside: It occurred to me that perhaps you deliberately chose "EC" as the official abbreviation for HEC for this very reason: that you had plans to "roll out" the JSON-based EC metadata/data protocol across other input methods, including TCP. But nope, I was wrong, because you've recently clarified the official abbreviation as being HEC, not EC.
"Ease of use. HEC has really rich support for JSON out of the box, you don't have to mess with sourcetypes or bending over backwards with your JSON."
Yes. I describe some of that "bending over" in my question. Much nicer with HEC, thanks.
However, as I mentioned, I don't find this (ease of use) a compelling enough reason to choose HEC over TCP. Unless the "rich support for JSON" comes with a performance benefit (that you don't plan to make available to TCP inputs).
"Security...."
Yes.
However, in the use cases I expect to see - I didn't mention this in my question - I suspect (although I don't know for sure) that all of this traffic will occur behind a firewall on an intranet or over a VPN.
I look forward to hearing more from you, especially regarding performance.
Not wishing to put words in your mouth, but framing your answer in the context of my question, I think what you're telling me is:
"There's a bunch of reasons [why you would use HEC when you can use TCP]: ... Performance"
That is, in a nutshell: HEC offers better performance than using TCP inputs. I'd like to hear more about that.
And if that's true, then perhaps the Splunk docs recommendation I cited in my question needs revisiting (or at least, qualifying):
"TCP ... is the recommended protocol for sending data from any remote host to your Splunk Enterprise server"