I am using an HEC input with a custom source type that sets _time from a field in the JSON data. With the "Add Data" sample data it works great and _time gets updated. However, when actually sending data to the HEC, _time stays at index time (not the time taken from the data).
To give a concrete example, in the JSON I have this line:
"timestampStr": "2022-06-03 19:38:19.736995059",
And built this sourcetype:
[_j_son_logan_test]
DATETIME_CONFIG =
LINE_BREAKER = \}()\{
NO_BINARY_CHECK = true
category = Custom
pulldown_type = 1
disabled = false
BREAK_ONLY_BEFORE_DATE =
SHOULD_LINEMERGE = false
TIME_PREFIX = \"timestampStr\": \"
TIME_FORMAT =
KV_MODE = json
INDEXED_EXTRACTIONS = json
And when using the Settings --> Add Data option, and selecting that Source Type, _time shows as 2022-06-03 19:38:19.736995059
However, when I send that JSON blob via curl to the HEC (which is set to a particular index and to use that sourcetype), the _time value shows the time it was indexed, i.e. right now (2022-06-24).
Looking at the data itself (index="my_index"), the sourcetype column shows _j_son_logan_test, so the sourcetype is being assigned.
Not sure what to check next, but open to thoughts and thank you!
See if this answer helps: https://community.splunk.com/t5/Getting-Data-In/Defining-Timestamp-for-HEC-Input/m-p/413425
Also, it's not advised to specify both KV_MODE=json and INDEXED_EXTRACTIONS=json as it's been said to result in double the field extractions.
So the dual flags issue wasn't the issue, but I did find (from the article you linked!) that I needed to send to the raw endpoint, and that works!
For those (noobies like me!) this means changing the URL to
curl -k https://ipaddress:8088/services/collector/raw
Instead of the
curl -k https://ipaddress:8088/services/collector/
I was sending to.
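For anyone scripting this instead of using curl, here is a minimal Python sketch of the same switch to the raw endpoint. The helper name, host and token are placeholders I made up, and the request is only built here, not sent. Sending it would also need an SSL context that mirrors curl's -k if the certificate is self-signed.

```python
import urllib.request

def build_raw_request(host, token, body):
    """Build (but do not send) a POST to the HEC raw endpoint.

    host and token are placeholders; 8088 is the port used in this thread.
    """
    # /raw is the endpoint that fixed timestamp parsing in this thread;
    # plain /services/collector/ was the one that kept index time.
    url = f"https://{host}:8088/services/collector/raw"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Authorization": f"Splunk {token}"},
        method="POST",
    )

req = build_raw_request("ipaddress", "YOUR-HEC-TOKEN", '{"timestampStr": "2022-06-03 19:38:19.736995059"}')
print(req.get_method(), req.full_url)
```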
But.
If you're explicitly sending to a HEC endpoint, you should already know what your timestamp is. So it's easier for the indexer/HF not to have to parse the timestamp out of the raw event. You can simply supply it with your event and be done with it. It also speeds up ingestion, since no time is wasted on timestamp extraction.
Think about it.
While I am confident you are right, I do not know what "You can simply supply it with your event" means, so I'm stuck extracting it (unless you have more specifics on what 'simply supply it' means)?
THANK YOU!
If you do a REST API request to HEC /collector (or /collector/event) endpoint, you're providing an event along with possible other fields (index, sourcetype, source, time) as well as custom indexed fields.
You can set your payload to include a time value (as an epoch timestamp with milliseconds). This way you have an absolute timestamp and no issues with timezone parsing and so on. I do that on a regular basis.
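As a rough illustration of getting such an epoch value from the timestampStr in the original question, here is a small Python sketch. It assumes the string is in UTC (which lines up with the epoch value 1654285099736 that comes up later in this thread) and it truncates the nanoseconds to milliseconds. The function name is made up for the example.

```python
from datetime import datetime, timezone

def timestamp_str_to_epoch(ts):
    """Convert 'YYYY-MM-DD HH:MM:SS[.fraction]' to epoch seconds with ms.

    Assumption: the input is UTC. Adjust tzinfo if your source differs.
    """
    whole, _, frac = ts.partition(".")
    dt = datetime.strptime(whole, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    # strptime's %f accepts at most 6 digits, so the 9-digit fraction is
    # handled by hand and truncated to milliseconds.
    millis = int((frac + "000")[:3])
    return int(dt.timestamp()) + millis / 1000

epoch = timestamp_str_to_epoch("2022-06-03 19:38:19.736995059")
print(epoch)
```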
Thanks! Looking at that document and when you say 'payload' that is the actual json message coming in, ya? So does that mean if we alter our JSON:
"event":{ "resourceId": "enum:172.17.2.238", "timestamp":"1654285099736"}
to this
"event":{ "resourceId": "enum:172.17.2.238", "time":"1654285099736"}
It will 'read' it naturally/natively?
Also note that our devs send epoch in milliseconds (not the <sec>.<ms> format specified in the doc you sent), so we may have to request that change as well.
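If changing the sender is not an option, the conversion from epoch milliseconds to the <sec>.<ms> form is a one-liner wherever the payload gets assembled. A minimal Python sketch, with a made-up function name:

```python
def epoch_ms_to_hec_time(epoch_ms):
    """Convert epoch milliseconds (int or numeric string) to epoch seconds
    with a fractional millisecond part."""
    return int(epoch_ms) / 1000

# The millisecond value from the JSON above, sent as a string by the devs.
hec_time = epoch_ms_to_hec_time("1654285099736")
print(hec_time)
```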
Thank you! This is great!
Close. The time goes at the top level of the HEC payload, next to "event", not inside it. You either send a text event plus time, or a JSON structure plus time.
So you can either send it as (full HEC payload):
{
"event": {
"resourceId": "enum:172.17.2.238",
"another_field": "another_value",
"and_so_on": "and_so_on"
},
"time": 1654285099.736
}
Or
{
"event": "{\"resourceId\": \"enum:172.17.2.238\" [...] }",
"time": 1654285099.736
}
If your software generates JSON anyway, it's of course more convenient to use the first form, with the event JSON data simply embedded within the "event" field.
The second form is useful mostly when you're forwarding pre-formatted data from another system or something like that.
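Both forms can be produced from the same data. Here is a hedged Python sketch of how a sender might build them. The helper name and the sample fields are only for illustration, and the result is meant as the body of a POST to /services/collector or /services/collector/event, not the raw endpoint.

```python
import json

def build_hec_event(event, epoch_seconds, as_text=False):
    """Return a HEC payload string with an explicit top-level "time".

    as_text=False embeds the dict as a JSON structure (first form);
    as_text=True embeds it as a pre-serialized string (second form).
    """
    body = json.dumps(event) if as_text else event
    return json.dumps({"event": body, "time": epoch_seconds})

event = {"resourceId": "enum:172.17.2.238", "another_field": "another_value"}
structured = build_hec_event(event, 1654285099.736)
text_form = build_hec_event(event, 1654285099.736, as_text=True)
print(structured)
print(text_form)
```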
wow this is great! had NO IDEA about this and this is a really big help. thank you for the specific example, sir!
Hi loganramirez,
Start with the obvious: check splunkd.log for errors such as truncation on the HEC input, and, if not done already, restart the HEC instance; new parsing configs most likely require a restart to be applied. Check with btool that your props.conf is really applied and is not being overridden by other settings. Check for typos in the sourcetype and for case mismatches 😉
Also try another tool like nc to send the test event just to rule out that it's not curl related.
Hope this helps ...
cheers, MuS