Getting Data In

Defining Timestamp for HEC Input

Kieffer87
Communicator

I'm running into a strange issue where Splunk is using the current time for an HTTP Event Collector input rather than pulling out the timestamp field I've defined in props.conf. I started by cloning the _json sourcetype and made a few adjustments, since event parsing and field extraction were working as expected. I've tried both TIMESTAMP_FIELDS and TIME_PREFIX in props.conf without any luck. I'm using a Python script to query the GitHub API and then passing the JSON to splunk_handler.

Payload

[SplunkHandler DEBUG] Sending payload: {"event": "[{\"created_at\": \"2019-01-18T15:24:13Z\", \"pr_user\": \"userid123\", \"merged_at\": \"2019-01-18T15:24:51Z\", \"pr_url\": \"https://github.com/someorganization/somerepo/pull/12345\", \"pr_number\": 12345, \"repo_name\": \"somerepo\"}, {\"created_at\": \"2019-01-18T14:56:27Z\", \"pr_user\": \"userid123\", \"merged_at\": \"2019-01-18T15:09:42Z\", \"pr_url\": \"https://github.com/someorganization/somerepo/pull/12346\", \"pr_number\": 12346, \"repo_name\": \"somerepo\"}]", "host": "myhost", "index": "prmetrics", "source": "test", "sourcetype": "json-github"}

Raw Event Text (as shown in Splunk)

{"created_at": "2019-01-17T21:20:55Z", "pr_user": "userid123", "merged_at": "2019-01-18T14:10:37Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12345", "pr_number": 12345, "repo_name": "somerepo"}

props.conf

[json-github]
INDEXED_EXTRACTIONS = json
KV_MODE = none
NO_BINARY_CHECK = true
disabled = false
SHOULD_LINEMERGE = false
TIME_PREFIX = \{\"created_at\"\:\s\"
MAX_TIMESTAMP_LOOKAHEAD = 50
#TIMESTAMP_FIELDS = created_at
1 Solution

harsmarvania57
SplunkTrust

Hi,

As far as I know, with the HEC event endpoint you need to supply the timestamp yourself when formatting your event, along with sourcetype, source and host. If you want Splunk to extract the timestamp from your raw data, I guess the /collector/event HEC endpoint will not work; instead you need to use the /collector/raw HEC endpoint.

I tested the sample data you provided in my lab: Splunk did not extract the timestamp from the raw data with the /collector/event HEC endpoint, but it did with the /collector/raw HEC endpoint.
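If you would rather stay on the event endpoint, one option is to set the timestamp yourself in the event envelope. Here is a minimal Python sketch of that idea, assuming one envelope per pull request with the epoch "time" key derived from created_at. The helper names (created_at_to_epoch, build_event_payload) are made up for illustration and are not part of splunk_handler, and the default metadata values are simply copied from the payload in the question.

```python
import json
from datetime import datetime, timezone


def created_at_to_epoch(created_at: str) -> float:
    """Convert a GitHub-style UTC timestamp (e.g. 2019-01-18T15:24:13Z) to epoch seconds."""
    parsed = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ")
    # The trailing Z means UTC, so attach the UTC zone before converting.
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def build_event_payload(pull_requests, host="myhost", index="prmetrics",
                        source="test", sourcetype="json-github") -> str:
    """Build one HEC event envelope per pull request, each with an explicit time."""
    envelopes = []
    for pr in pull_requests:
        envelopes.append({
            "time": created_at_to_epoch(pr["created_at"]),
            "host": host,
            "index": index,
            "source": source,
            "sourcetype": sourcetype,
            "event": pr,  # the pull request itself, not a JSON-encoded string of a list
        })
    # Assumption: envelopes are batched in one request body, one per line.
    return "\n".join(json.dumps(envelope) for envelope in envelopes)


if __name__ == "__main__":
    sample = [{"created_at": "2019-01-18T15:24:13Z", "pr_number": 12345}]
    print(build_event_payload(sample))
```

The resulting string would be the POST body for /collector/event. Note that the payload in the question sends the whole list as a single string inside "event", so this also splits it into one event per pull request.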

I used curl to ingest the data into Splunk through the HEC raw endpoint.

curl -vk "https://localhost:8088/services/collector/raw?channel=A1B2C34D-12A3-1234-A123-12ABC1234567&sourcetype=json-github&source=test&host=myhost&index=main" -H "Authorization: Splunk 1ab23cd64-a12b-123a-1ab2-123ab4c56d78" -d '[{"created_at": "2019-01-18T15:24:13Z", "pr_user": "userid123", "merged_at": "2019-01-18T15:24:51Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12345", "pr_number": 12345, "repo_name": "somerepo"}, {"created_at": "2019-01-18T14:56:27Z", "pr_user": "userid123", "merged_at": "2019-01-18T15:09:42Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12346", "pr_number": 12346, "repo_name": "somerepo"}]'

and props.conf

[json-github]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = created_at
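
Since the original script is in Python, the same raw-endpoint request as the curl command can be sketched with the standard library. This is only a sketch under assumptions: build_raw_request is a hypothetical helper, the URL, token and channel are the placeholder values from the curl example, and it only builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request


def build_raw_request(base_url, token, channel, events, **metadata):
    """Build (but do not send) a POST request for the HEC raw endpoint."""
    # Metadata such as sourcetype, source, host and index go in the query string,
    # exactly as in the curl command.
    query = urllib.parse.urlencode({"channel": channel, **metadata})
    url = f"{base_url}/services/collector/raw?{query}"
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Splunk {token}"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_raw_request(
        "https://localhost:8088",
        "1ab23cd64-a12b-123a-1ab2-123ab4c56d78",   # placeholder token from the curl example
        "A1B2C34D-12A3-1234-A123-12ABC1234567",    # placeholder channel from the curl example
        [{"created_at": "2019-01-18T15:24:13Z", "pr_number": 12345}],
        sourcetype="json-github", source="test", host="myhost", index="main",
    )
    print(request.get_method(), request.full_url)
```

To actually send it you would pass the request to urllib.request.urlopen. The curl example uses -k to skip certificate verification, so a lab instance with a self-signed certificate would need an equivalent SSL context.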

AlexHauptner
Engager

There is a solution for Splunk > 7.2 : INGEST_EVAL

props.conf

[json-github]
...
DATETIME_CONFIG = CURRENT
TRANSFORMS-get-date = construct_date

transforms.conf

[construct_date]
INGEST_EVAL=_time=strptime(substr(_raw,17,20),"%Y-%m-%dT%H:%M:%SZ")

This works for HEC-event and HEC-raw endpoint!

For further information look at: https://conf.splunk.com/files/2020/slides/PLA1154C.pdf
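
The substr offsets are easy to get wrong, so here is a small Python sketch that mirrors the expression against the raw event from the question. It assumes the event always starts with {"created_at": " so that the timestamp begins at character 17. splunk_substr is a hypothetical helper that imitates the 1-based indexing of the eval substr function, and the sketch treats the literal Z as UTC, which is an assumption on my side; how Splunk interprets it depends on its timezone settings.

```python
from datetime import datetime, timezone

# Shortened copy of the raw event shown in the question.
RAW = '{"created_at": "2019-01-17T21:20:55Z", "pr_user": "userid123"}'


def splunk_substr(text: str, start: int, length: int) -> str:
    """Imitate eval substr(): start is 1-based, length is a character count."""
    return text[start - 1:start - 1 + length]


# Same arguments as substr(_raw,17,20) in transforms.conf.
stamp = splunk_substr(RAW, 17, 20)

# Same format string as in the INGEST_EVAL; Z is matched as a literal character.
epoch = (
    datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    .replace(tzinfo=timezone.utc)  # assumption: treat the value as UTC
    .timestamp()
)

if __name__ == "__main__":
    print(stamp, epoch)
```

If the field order in the JSON ever changes, the fixed offsets stop matching, which is where a field-based lookup is more robust than substr.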

j3rb0i
Engager

I ended up using the following thanks to this tip.  Works flawlessly:

INGEST_EVAL=_time=strptime(spath(_raw,"timestamp"), "%Y-%m-%dT%H:%M:%S%3N%z")

ljc01
New Member

Which HEC endpoint are you using (raw or event)? I assume you're using the event endpoint based on your post. I do not believe Splunk will let you overwrite the time field when you use that endpoint. If you want to set or override the time field dynamically, you will need to use the raw endpoint. While I have not tested it, I assume you cannot overwrite any of the metadata fields when using the event endpoint, based on the timestamp override issue you are experiencing.

I’m not sure why this is the desired behavior. It would be nice to get clarification from a Splunk HEC dev.

prakash007
Builder

@Kieffer87: did you try using TIME_FORMAT in your configs?

    MAX_TIMESTAMP_LOOKAHEAD = 20
    TIME_FORMAT = %Y-%m-%dT%H:%M:%S%Z

Kieffer87
Communicator

Yes, and I get the same result: the timestamp defaults to the time the data is received.

prakash007
Builder

What's your workflow? Do you have this config on your Splunk HEC server?

Kieffer87
Communicator

Running python script from my machine. Have a single instance of Splunk running HEC/Indexing/Search.
