HEC Sourcetype

thomastaylor
Communicator

Hello everyone! I have a brief question regarding the HEC input. Our primary data input is the HEC. For new applications that want to forward through our deployed Heavy Forwarder, we must first configure a token for them and set a sourcetype.

We're advocating for our applications to send data in JSON format; however, if I were to select the _json sourcetype, this would not be correct. To show how their logs would look, here's an example JSON object:

{
    "time": 1426279439, // epoch time
    "host": "localhost",
    "source": "datasource",
    "sourcetype": "txt",
    "event":  "xx.xxx.xxx.xx /web/link/goes/here error 404"
}

I realize that the "event" attribute can be broken down into more key/value pairs, but most applications that want to integrate with our service may not want to split their entire log line into key/value pairs, since some applications will not have a clear way of doing that.

If we were to add extractions for the "event" field, we would be modifying the _json sourcetype itself (which we wouldn't want). We're assuming the best way around this problem is to duplicate the _json sourcetype and rename the copy so that we can add the extractions there?

Thanks in advance!

0 Karma
1 Solution

thomastaylor
Communicator

I found exactly what I was looking for while going through some documentation. The endpoint that I was looking for was
/services/collector/raw, which lets us send raw data without wrapping it in the JSON envelope.
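
For anyone who finds this later, here is a minimal sketch in Python of what a request to the raw endpoint might look like. The host name, token, and the sourcetype name app_access are placeholders I made up, not values from our deployment, and the request is only built here, not sent. Passing the sourcetype as a query parameter is one option; the token's default sourcetype would apply otherwise. Depending on your Splunk version and whether indexer acknowledgment is enabled, a channel identifier may also be required, so check the HEC documentation for your release.

```python
import urllib.request
from urllib.parse import urlencode


def build_raw_request(base_url, token, raw_line, sourcetype=None):
    """Build (but do not send) a POST to the HEC raw endpoint.

    base_url, token and sourcetype are placeholders supplied by the caller.
    """
    url = f"{base_url}/services/collector/raw"
    if sourcetype:
        # Metadata for raw events travels in the query string, not the body.
        url += "?" + urlencode({"sourcetype": sourcetype})
    return urllib.request.Request(
        url,
        data=raw_line.encode("utf-8"),  # the log line is sent as-is
        headers={"Authorization": f"Splunk {token}"},
        method="POST",
    )


# Hypothetical values for illustration only.
req = build_raw_request(
    "https://hf.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    "xx.xxx.xxx.xx /web/link/goes/here error 404",
    sourcetype="app_access",
)
```

To actually send it you would pass req to urllib.request.urlopen (with whatever TLS settings your environment needs). Because the body is the unmodified log line, timestamp and field extraction are left to the sourcetype configuration on the Splunk side.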

0 Karma

gjanders
SplunkTrust

_json has index-time field extractions, which you may or may not want depending on the data you're sending in.

Also, that event would not work, as the content inside the event field is not in JSON format:

xx.xxx.xxx.xx /web/link/goes/here error 404

It would likely be rejected at parsing time, as it's not true JSON-style data.
That said, JSON data is auto-KV'ed at search time, so if you didn't want indexed field extractions you could use any sourcetype you wish and it would work at search time anyway. It's rare to need to extract fields from JSON manually, since Splunk can do it out of the box.

Alerts for Splunk Admins https://splunkbase.splunk.com/app/3796/
Version Control for Splunk https://splunkbase.splunk.com/app/4355/
0 Karma

thomastaylor
Communicator

Thanks for your answer! I'm concerned that the applications that send to our deployment may not attempt to set up a JSON dictionary; however, I would want them to be able to extract _time from their logs if they choose to send them over HTTPS. It seems like the only clear way of doing this is to use the "time" key/value pair with the _json sourcetype. Is that correct?

0 Karma

gjanders
SplunkTrust

If you refer to the HTTP Event Collector documentation, it says:

 <endpoint> is the HEC endpoint you want to use. In many cases, you use the /services/collector endpoint for JavaScript Object Notation (JSON)-formatted events or the services/collector/raw endpoint for raw events

If you want to send JSON-style data, then you can refer to "Format events" in that documentation. If you choose the JSON style and don't pass in a time field, it will use the current system time on the heavy forwarder/indexer, and it will not parse the time from the raw data...
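
To make the difference concrete, here is a rough Python sketch that builds the JSON body for the /services/collector endpoint with and without a time field. The helper name build_hec_event and the sourcetype app_access are made up for illustration; only the envelope keys (time, host, source, sourcetype, event) come from the example earlier in this thread.

```python
import json


def build_hec_event(event, sourcetype, event_time=None, host=None, source=None):
    """Return the JSON body for one event sent to /services/collector.

    Optional keys are left out entirely when the caller does not supply them.
    """
    envelope = {"event": event, "sourcetype": sourcetype}
    optional = {"time": event_time, "host": host, "source": source}
    envelope.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(envelope)


log_line = "xx.xxx.xxx.xx /web/link/goes/here error 404"

# The sender supplies the epoch timestamp explicitly.
with_time = build_hec_event(log_line, "app_access", event_time=1426279439)

# No "time" key at all: the receiving instance falls back to its own clock.
without_time = build_hec_event(log_line, "app_access")
```

With the first body the event carries the timestamp the sender supplied. With the second, as described above, the current system time is used and any timestamp inside the log line itself is not parsed.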

Alerts for Splunk Admins https://splunkbase.splunk.com/app/3796/
Version Control for Splunk https://splunkbase.splunk.com/app/4355/
0 Karma