Getting Data In

Best practice for overriding source key in inputs.conf?

I've seen the related question "Override source key in inputs.conf".

I've pretty much decided that I do want to override the source key (although I'm open to counterarguments): the question now is, to what?

Here's my situation: I'm using a proprietary, platform-specific tool to extract many types of log records from various systems on that platform. I'm then sending those extracted log records to a remote Splunk instance via either HTTP (that is, to the Splunk HTTP Event Collector; EC) or TCP.

For the purposes of this question, I'm going to refer to that log extraction tool as xyz.

Events ingested via EC have the source field value http:xyz, where xyz is the name of the Event Collector token that I created for this purpose, deliberately matching the name of the tool. I am dimly aware of the possibility - although no use case occurs to me right now - that, in the future, I might want to create additional EC tokens for xyz; perhaps I'll append qualifying terms with an underscore separator, I'm not sure.

Events ingested via TCP have the default source field value tcp:6666, where 6666 is the TCP port.

I don't feel that comfortable with this default source value for the TCP-ingested events. I'd prefer a more "mnemonic" value that doesn't refer to a specific port number. In a multisite cluster, indexers might, for site-specific reasons, be listening on different port numbers. I think I'd prefer to have the same source value - for example, tcp:xyz - regardless of which indexer ingests an event, and what TCP port it's listening on.

So, although this naming scheme is likely simplistic - hence this question about best practice; I'm hoping for advice from more experienced users - I'm leaning towards source values in the following format:

input : sender

where sender is, in my case, the tool xyz. So: http:xyz (as now) for the EC-ingested events, and tcp:xyz (instead of the default tcp:6666) for the TCP-ingested events.
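For the TCP case, a minimal sketch of what that override might look like in inputs.conf, assuming the `source` setting is placed in the TCP input stanza (port 6666 and the name xyz are the placeholders used in this question, not real values):

```ini
# inputs.conf on the indexer that receives the TCP feed (sketch only)
# 6666 is the example port from this question; another indexer
# might listen on a different port.
[tcp://6666]
# Replaces the default source value tcp:6666 for events from this input
source = tcp:xyz
```

An indexer listening on a different port would have a different stanza header but could carry the same `source = tcp:xyz` line, so searches on source would not depend on the port.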

Thoughts, suggestions welcome.

For example:

  • Should I use an underscore instead of the colon as a separator? (I realize that the colon implies a protocol rather than some more generalized notion of "input type/method".)
  • Should I reverse the order of these qualifiers: for example, xyz_http?
  • Why don't I use the same source value - perhaps just xyz - regardless of input (ingestion) method? It's difficult to put my finger on many concrete reasons. Perhaps one: I'm sending JSON to both EC and TCP, but the JSON structure is slightly different (I wish it weren't). If I need to debug ingestion issues, it might be helpful to be able to differentiate the events; but then, the inherent differences in the structure of the JSON payloads mean I can already do that.

I understand that some of this might come down to personal preference, but I'm interested in what other people are doing, and why.


Re: Best practice for overriding source key in inputs.conf?

Esteemed Legend

I would use HEC:xyz where HEC is the common name for HTTP Event Collector.


Re: Best practice for overriding source key in inputs.conf?

Hi @woodcock, thanks for the suggestion:

I would use HEC:xyz where HEC is the common name for HTTP Event Collector.

How common?

The first Splunk blog post tagged http-event-collector, "HTTP Event Collector, your DIRECT event pipe to Splunk 6.3", uses the abbreviation EC:

HTTP Event Collector (EC) is a new, robust, token-based JSON API

So does the Splunk dev topic "Introduction to Splunk HTTP Event Collector":

Welcome to Splunk HTTP Event Collector (EC)

So does the "Walkthrough" dev topic:

the EC port ... an HTTP Event Collector authentication token ("EC token"). EC tokens are ... the EC event protocol ...

But then, the latest Splunk blog post tagged http-event-collector, "There is a “LOG”! Introducing Splunk Logging Driver in Docker 1.10.0", on 10 February 2016, refers to HEC:

Built on the HTTP Event Collector (HEC) ... Enable HEC ... Create a New HEC Token

And Googling for:

"HTTP Event Collector (HEC)" site:splunk.com

returns "about 38 results", whereas:

"HTTP Event Collector (EC)" site:splunk.com

returns "about 32 results".

If any Splunk tech writers are reading this: what's the official abbreviation: EC or HEC?


Re: Best practice for overriding source key in inputs.conf?

SplunkTrust

Good spotting!

The Splunk Splexicon lists neither EC http://docs.splunk.com/Splexicon#anchorE nor HEC http://docs.splunk.com/Splexicon#anchorH

A search for EC http://docs.splunk.com/Special:SplunkSearch/docs?q=ec returns 8 results and a search for HEC http://docs.splunk.com/Special:SplunkSearch/docs?q=hec returns 0 results.

cheers, MuS


Re: Best practice for overriding source key in inputs.conf?

Esteemed Legend

I was judging from speakers at Splunk events and also blogs. My experience there is that HEC is far more prevalent than EC. Your sleuthing has clearly shown that a documentation cleanup and an official statement on the matter are warranted.


Re: Best practice for overriding source key in inputs.conf?

Splunk Employee

I know which acronym marketing likes, and it's not HEC. I agree that some clarity is needed.


Re: Best practice for overriding source key in inputs.conf?

Splunk Employee

Okay, I have spoken with the product manager about this and the inconsistency you see results from a change in usage. When we first introduced the feature, we officially abbreviated it as EC. Over time, HEC became more widely used, and we have adapted our standard to reflect that. The correct abbreviation is now HEC. We are updating the docs and original blog post to reflect this change.


Re: Best practice for overriding source key in inputs.conf?

Splunk Employee

@woodcock Hal is right: initially we had decided not to use HEC. However, that ship has since sailed and everyone is using it anyway. So we have relaxed that, and we are going to update our docs, which is why you are seeing HEC show up now. Thanks for reporting this.


Re: Best practice for overriding source key in inputs.conf?

Splunk Employee

"Hear ye, hear ye, henceforth HEC is a permitted term and you may use it without fear!"


Re: Best practice for overriding source key in inputs.conf?

Splunk Employee

lol I blame @damian dallimore
