I've seen the related question "Override source key in inputs.conf".
I've pretty much decided that I do want to override the source key (although I'm open to counterarguments): the question now is, to what?
Here's my situation: I'm using a proprietary, platform-specific tool to extract many types of log records from various systems on that platform. I'm then sending those extracted log records to a remote Splunk instance via either HTTP (that is, to the Splunk HTTP Event Collector; EC) or TCP.
For the purposes of this question, I'm going to refer to that log extraction tool as xyz.

Events ingested via EC have the source field value http:xyz, where xyz is the name of the Event Collector token that I created for this purpose, deliberately matching the name of the tool. I am dimly aware of the possibility - although no use case occurs to me right now - that, in the future, I might want to create additional EC tokens for xyz; perhaps I'll append qualifying terms with an underscore separator, I'm not sure.
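For what it's worth, the HEC protocol also lets the sender set source explicitly per event, so the value needn't depend on the token name at all. A minimal sketch of such a request, assuming a hypothetical host and token (the /services/collector/event endpoint and the source/sourcetype metadata keys are standard HEC):

```
POST https://splunk.example.com:8088/services/collector/event
Authorization: Splunk <hec-token>

{
  "source": "http:xyz",
  "sourcetype": "_json",
  "event": { "message": "example log record" }
}
```

If the source is set in the payload like this, it overrides the http:<token name> default.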
Events ingested via TCP have the default source field value tcp:6666, where 6666 is the TCP port.
I don't feel that comfortable with this default source value for the TCP-ingested events. I'd prefer a more "mnemonic" value that doesn't refer to a specific port number. In a multisite cluster, indexers might, for site-specific reasons, be listening on different port numbers. I think I'd prefer to have the same source value - for example, tcp:xyz - regardless of which indexer ingests an event, and which TCP port it's listening on.
So, although this naming scheme is likely simplistic - hence this question about best practice; I'm hoping for advice from more experienced users - I'm leaning towards source values in the format method:sender, where sender is, in my case, the tool xyz: http:xyz (as now) for the EC-ingested events, and tcp:xyz (instead of the default tcp:6666) for the TCP-ingested events.
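To make the TCP side of that concrete, the override can be sketched in inputs.conf on each indexer (a hypothetical stanza, using port 6666 as above; the source key here replaces the tcp:6666 default):

```
[tcp://6666]
source = tcp:xyz
sourcetype = _json
```

Because the source value is set in the stanza rather than derived from the port, every indexer produces the same value regardless of which port it happens to listen on.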
Thoughts, suggestions welcome.
You might ask: why not use the same source value - perhaps just xyz - regardless of input (ingestion) method? It's difficult to put my finger on many concrete reasons. Perhaps one: I'm sending JSON to both EC and TCP, but the JSON structure is slightly different (I wish it wasn't). If I need to debug ingestion issues, it might be helpful to be able to differentiate the events; but then, the inherent differences in the structure of the JSON payloads mean I can already do that.
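With the method-qualified values in place, differentiating the two ingestion paths during debugging would be a trivial search. A sketch in SPL, assuming a hypothetical index name:

```
index=xyz_logs (source="http:xyz" OR source="tcp:xyz") | stats count by source
```

The same breakdown would be impossible with a single shared source value without falling back on the differences in the JSON payloads.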
I understand that some of this might come down to personal preference, but I'm interested in what other people are doing, and why.
@woodcock Hal is right; initially, we had decided not to use HEC. However, that ship has since sailed and everyone is using it anyway. So, we have relaxed that and we are going to update our docs, which is why you are seeing HEC show up now. Thanks for reporting this.
Hi @woodcock, thanks for the suggestion to use HEC, on the basis that "HEC is the common name for HTTP Event Collector". Is it, though?
The first Splunk blog post tagged http-event-collector, "HTTP Event Collector, your DIRECT event pipe to Splunk 6.3", uses the abbreviation EC:
HTTP Event Collector (EC) is a new, robust, token-based JSON API
So does the Splunk dev topic "Introduction to Splunk HTTP Event Collector":
Welcome to Splunk HTTP Event Collector (EC)
So does the "Walkthrough" dev topic:
the EC port ... an HTTP Event Collector authentication token ("EC token"). EC tokens are ... the EC event protocol ...
But then, the latest Splunk blog post tagged http-event-collector, "There is a “LOG”! Introducing Splunk Logging Driver in Docker 1.10.0", on 10 February 2016, refers to HEC:
Built on the HTTP Event Collector (HEC) ... Enable HEC ... Create a New HEC Token
And Googling for:
"HTTP Event Collector (HEC)" site:splunk.com
returns "about 38 results", whereas:
"HTTP Event Collector (EC)" site:splunk.com
returns "about 32 results".
If any Splunk tech writers are reading this: what's the official abbreviation: EC or HEC?
Okay, I have spoken with the product manager about this and the inconsistency you see results from a change in usage. When we first introduced the feature, we officially abbreviated it as EC. Over time, HEC became more widely used, and we have adapted our standard to reflect that. The correct abbreviation is now HEC. We are updating the docs and original blog post to reflect this change.
I was judging from speakers at Splunk events and also blogs. My experience there is that HEC is far more prevalent than EC. Your sleuthing has shown that a documentation cleanup and an official statement on the matter are clearly warranted.
A search of the docs for EC (http://docs.splunk.com/Special:SplunkSearch/docs?q=ec) returns 8 results, and a search for HEC (http://docs.splunk.com/Special:SplunkSearch/docs?q=hec) returns 0 results.