
Defining Timestamp for HEC Input

Kieffer87
Communicator

I'm running into a strange issue where Splunk uses the current time for an HTTP Event Collector input instead of extracting the timestamp field I've defined in props.conf. I started by cloning the _json sourcetype and made a few adjustments; event parsing and field extraction work as expected. I've tried both TIMESTAMP_FIELDS and TIME_PREFIX in props.conf without any luck. I'm using a Python script to query the GitHub API and then passing the JSON to splunk_handler.

Payload

[SplunkHandler DEBUG] Sending payload: {"event": "[{\"created_at\": \"2019-01-18T15:24:13Z\", \"pr_user\": \"userid123\", \"merged_at\": \"2019-01-18T15:24:51Z\", \"pr_url\": \"https://github.com/someorganization/somerepo/pull/12345\", \"pr_number\": 12345, \"repo_name\": \"somerepo\"}, {\"created_at\": \"2019-01-18T14:56:27Z\", \"pr_user\": \"userid123\", \"merged_at\": \"2019-01-18T15:09:42Z\", \"pr_url\": \"https://github.com/someorganization/somerepo/pull/12346\", \"pr_number\": 12346, \"repo_name\": \"somerepo\"}]", "host": "myhost", "index": "prmetrics", "source": "test", "sourcetype": "json-github"}

Raw Event Text (as shown in Splunk)

{"created_at": "2019-01-17T21:20:55Z", "pr_user": "userid123", "merged_at": "2019-01-18T14:10:37Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12345", "pr_number": 12345, "repo_name": "somerepo"}

props.conf

[json-github]
INDEXED_EXTRACTIONS = json
KV_MODE = none
NO_BINARY_CHECK = true
disabled = false
SHOULD_LINEMERGE = false
TIME_PREFIX = \{\"created_at\"\:\s\"
MAX_TIMESTAMP_LOOKAHEAD = 50
#TIMESTAMP_FIELDS = created_at
1 Solution

harsmarvania57
SplunkTrust

Hi,

As far as I know, for the HEC event endpoint you need to supply the timestamp yourself when formatting the event, alongside sourcetype, source and host. If you want Splunk to extract the timestamp from your raw data, I don't think the /collector/event endpoint will work; you need to use the /collector/raw endpoint instead.

I tested the sample data you provided in my lab. The timestamp was not extracted from the raw data with the /collector/event endpoint, but it was extracted with the /collector/raw endpoint.

I used curl to ingest the data into Splunk through the HEC raw endpoint.

curl -vk "https://localhost:8088/services/collector/raw?channel=A1B2C34D-12A3-1234-A123-12ABC1234567&sourcetype=json-github&source=test&host=myhost&index=main" -H "Authorization: Splunk 1ab23cd64-a12b-123a-1ab2-123ab4c56d78" -d '[{"created_at": "2019-01-18T15:24:13Z", "pr_user": "userid123", "merged_at": "2019-01-18T15:24:51Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12345", "pr_number": 12345, "repo_name": "somerepo"}, {"created_at": "2019-01-18T14:56:27Z", "pr_user": "userid123", "merged_at": "2019-01-18T15:09:42Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12346", "pr_number": 12346, "repo_name": "somerepo"}]'

and props.conf

[json-github]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = created_at
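
If you would rather stay on the /collector/event endpoint, the "supply the timestamp yourself" route mentioned above could look roughly like the sketch below. It relies on the HEC event envelope accepting a "time" key in epoch seconds. The helper names to_epoch and build_hec_events are hypothetical, the default index/source/host values are copied from the payload in the question, and how the envelope is passed through splunk_handler is not covered here.

```python
import json
from datetime import datetime, timezone


def to_epoch(iso_utc: str) -> float:
    """Convert a timestamp such as 2019-01-18T15:24:13Z to epoch seconds."""
    parsed = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ")
    # Assumption: the trailing Z means UTC, as in the GitHub API output above.
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def build_hec_events(pull_requests, index="prmetrics", source="test",
                     sourcetype="json-github", host="myhost"):
    """Return one HEC event envelope per pull request, each with an explicit time."""
    return [
        {
            "time": to_epoch(pr["created_at"]),  # event time taken from the data
            "host": host,
            "index": index,
            "source": source,
            "sourcetype": sourcetype,
            "event": pr,  # the pull request itself, as a JSON object
        }
        for pr in pull_requests
    ]


if __name__ == "__main__":
    sample = [{
        "created_at": "2019-01-18T15:24:13Z",
        "pr_user": "userid123",
        "merged_at": "2019-01-18T15:24:51Z",
        "pr_number": 12345,
        "repo_name": "somerepo",
    }]
    for envelope in build_hec_events(sample):
        print(json.dumps(envelope))
```

Each pull request becomes its own envelope here, instead of one event holding a JSON-encoded string of the whole list as in the original payload.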


AlexHauptner
Engager

There is another solution for Splunk > 7.2: INGEST_EVAL

props.conf

[json-github]
...
DATETIME_CONFIG = CURRENT
TRANSFORMS-get-date = construct_date

transforms.conf

[construct_date]
INGEST_EVAL=_time=strptime(substr(_raw,17,20),"%Y-%m-%dT%H:%M:%SZ")

This works for both the HEC event and HEC raw endpoints!

For further information, see: https://conf.splunk.com/files/2020/slides/PLA1154C.pdf
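
The substr(_raw,17,20) call is positional: it takes 20 characters starting at the 17th character of the raw event, which is where the created_at value sits in the sample event from the question. A quick way to sanity-check those offsets outside Splunk is a small Python sketch like the one below. It only mimics the arithmetic; the substr and extract_time helpers are hypothetical stand-ins, not Splunk code, and treating the trailing Z as UTC is an assumption.

```python
from datetime import datetime, timezone

# Shortened copy of the raw event text from the question.
RAW = '{"created_at": "2019-01-17T21:20:55Z", "pr_user": "userid123"}'


def substr(text: str, start: int, length: int) -> str:
    """Mimic a 1-based substr(text, start, length)."""
    return text[start - 1:start - 1 + length]


def extract_time(raw: str) -> float:
    """Parse the timestamp found at the fixed offset and return epoch seconds."""
    stamp = substr(raw, 17, 20)
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    # Assumption: the literal Z marks UTC.
    return parsed.replace(tzinfo=timezone.utc).timestamp()


if __name__ == "__main__":
    print(substr(RAW, 17, 20))
    print(extract_time(RAW))
```

Because the offsets are fixed, this approach depends on created_at staying the first key in the JSON object; if the key order or the text before it changes, the offsets need to change too.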

j3rb0i
Engager

I ended up using the following thanks to this tip.  Works flawlessly:

INGEST_EVAL=_time=strptime(spath(_raw,"timestamp"), "%Y-%m-%dT%H:%M:%S%3N%z")

ljc01
New Member

Which HEC endpoint are you using, raw or event? Based on your post, I assume you're using the event endpoint. I do not believe Splunk will let you overwrite the time field when you use that endpoint. If you want to set or override the time field dynamically, you will need to use the raw endpoint. While I haven't tested it, I assume from the timestamp override issue you are experiencing that you cannot overwrite any of the metadata fields when using the event endpoint.

I’m not sure why this is the desired behavior. It would be nice to get clarification from a Splunk HEC dev.
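
Since the original script is written in Python, a request to the raw endpoint could be assembled along these lines. This is only a sketch using the standard library: build_raw_request is a hypothetical helper, and the query parameters simply mirror the ones used in the curl example in the accepted solution. The request is built but not sent, and TLS verification settings are left out.

```python
import json
import urllib.parse
import urllib.request


def build_raw_request(base_url, token, channel, events,
                      sourcetype="json-github", source="test",
                      host="myhost", index="main"):
    """Build (but do not send) a POST request for the HEC raw endpoint."""
    query = urllib.parse.urlencode({
        "channel": channel,
        "sourcetype": sourcetype,
        "source": source,
        "host": host,
        "index": index,
    })
    return urllib.request.Request(
        url=f"{base_url}/services/collector/raw?{query}",
        # The body is the plain JSON, without any HEC event envelope.
        data=json.dumps(events).encode("utf-8"),
        headers={"Authorization": f"Splunk {token}"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_raw_request(
        "https://localhost:8088",
        "1ab23cd64-a12b-123a-1ab2-123ab4c56d78",  # placeholder token
        "A1B2C34D-12A3-1234-A123-12ABC1234567",   # placeholder channel
        [{"created_at": "2019-01-18T15:24:13Z", "pr_number": 12345}],
    )
    print(request.full_url)
```

Passing the result to urllib.request.urlopen would send it. With this endpoint the timestamp is taken from the data according to props.conf, so the sourcetype in the query string has to match the stanza name.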


prakash007
Builder

@Kieffer87: did you try using TIME_FORMAT in your configs?

    MAX_TIMESTAMP_LOOKAHEAD = 20
    TIME_FORMAT = %Y-%m-%dT%H:%M:%S%Z

Kieffer87
Communicator

Yes, and I get the same result: the timestamp defaults to the time the data is received.

prakash007
Builder

What's your workflow? Do you have this config on your Splunk HEC server?

Kieffer87
Communicator

I'm running the Python script from my machine. I have a single Splunk instance handling HEC, indexing and search.
