
Defining Timestamp for HEC Input

Communicator

I'm running into a strange issue where Splunk uses the current time for an HTTP Event Collector input instead of the timestamp field I've defined in props.conf. I started by cloning the json sourcetype and made a few adjustments; event parsing and field extraction work as expected. I've tried both TIMESTAMP_FIELDS and TIME_PREFIX in props.conf without any luck. I'm using a Python script to query the GitHub API and then passing the JSON to splunk_handler.

Payload

[SplunkHandler DEBUG] Sending payload: {"event": "[{\"created_at\": \"2019-01-18T15:24:13Z\", \"pr_user\": \"userid123\", \"merged_at\": \"2019-01-18T15:24:51Z\", \"pr_url\": \"https://github.com/someorganization/somerepo/pull/12345\", \"pr_number\": 12345, \"repo_name\": \"somerepo\"}, {\"created_at\": \"2019-01-18T14:56:27Z\", \"pr_user\": \"userid123\", \"merged_at\": \"2019-01-18T15:09:42Z\", \"pr_url\": \"https://github.com/someorganization/somerepo/pull/12346\", \"pr_number\": 12346, \"repo_name\": \"somerepo\"}]", "host": "myhost", "index": "prmetrics", "source": "test", "sourcetype": "json-github"}

Raw Event Text (as shown in Splunk)

{"created_at": "2019-01-17T21:20:55Z", "pr_user": "userid123", "merged_at": "2019-01-18T14:10:37Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12345", "pr_number": 12345, "repo_name": "somerepo"}

props.conf

[json-github]
INDEXED_EXTRACTIONS = json
KV_MODE = none
NO_BINARY_CHECK = true
disabled = false
SHOULD_LINEMERGE = false
TIME_PREFIX = \{\"created_at\"\:\s\"
MAX_TIMESTAMP_LOOKAHEAD = 50
#TIMESTAMP_FIELDS = created_at
1 Solution

SplunkTrust

Hi,

As far as I know, with the HEC event endpoint you need to supply the timestamp yourself when you format the event, alongside sourcetype, source, and host. If you want Splunk to extract the timestamp from your raw data, I don't think the /collector/event endpoint will work; you need to use the /collector/raw endpoint instead.

I tested the sample data you provided in my lab. The timestamp was not extracted from the raw data with the /collector/event endpoint, but it was extracted with the /collector/raw endpoint.

I used curl to send the data to Splunk through the HEC raw endpoint:

curl -vk "https://localhost:8088/services/collector/raw?channel=A1B2C34D-12A3-1234-A123-12ABC1234567&sourcetype=json-github&source=test&host=myhost&index=main" -H "Authorization: Splunk 1ab23cd64-a12b-123a-1ab2-123ab4c56d78" -d '[{"created_at": "2019-01-18T15:24:13Z", "pr_user": "userid123", "merged_at": "2019-01-18T15:24:51Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12345", "pr_number": 12345, "repo_name": "somerepo"}, {"created_at": "2019-01-18T14:56:27Z", "pr_user": "userid123", "merged_at": "2019-01-18T15:09:42Z", "pr_url": "https://github.com/someorganization/somerepo/pull/12346", "pr_number": 12346, "repo_name": "somerepo"}]'

with this props.conf:

[json-github]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = created_at
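
If the script has to stay on the /collector/event endpoint, the first point above (supplying the timestamp yourself) could look roughly like the sketch below. It rests on a few assumptions: the helper names to_epoch and build_event_envelopes are made up for illustration, the script builds the HEC payload itself rather than going through splunk_handler, and the timestamp goes into the event endpoint's "time" key as epoch seconds.

```python
from datetime import datetime, timezone


def to_epoch(iso_utc):
    """Convert a GitHub-style UTC timestamp such as 2019-01-18T15:24:13Z to epoch seconds."""
    parsed = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ")
    # strptime returns a naive datetime; mark it as UTC before converting.
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def build_event_envelopes(pull_requests, host="myhost", index="prmetrics",
                          source="test", sourcetype="json-github"):
    """Hypothetical helper: wrap each pull request in its own HEC event envelope.

    The "time" key carries created_at as epoch seconds, so the timestamp
    is supplied explicitly instead of being extracted at index time.
    """
    envelopes = []
    for pr in pull_requests:
        envelopes.append({
            "time": to_epoch(pr["created_at"]),
            "host": host,
            "index": index,
            "source": source,
            "sourcetype": sourcetype,
            "event": pr,  # the pull request as a JSON object, not a string
        })
    return envelopes
```

Each envelope holds a single pull request, so every pull request would become its own event with created_at as its timestamp, rather than one event containing the whole list.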


New Member

Which HEC endpoint are you using, raw or event? Based on your post, I assume you're using the event endpoint. I do not believe Splunk will let you overwrite the time field through props.conf when you use that endpoint. If you want to set or override the time field dynamically, you will need to use the raw endpoint. I haven't tested it, but given the timestamp override issue you are experiencing, I assume you cannot overwrite any of the metadata fields when using the event endpoint.
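
Since the original script is in Python, here is a rough sketch of what a raw-endpoint request could look like there. It has not been run against a live Splunk instance: build_raw_request is a hypothetical helper, the base URL, channel, and token are placeholders, and the function only builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request


def build_raw_request(events, token, channel, metadata=None,
                      base_url="https://localhost:8088"):
    """Hypothetical helper: build (but do not send) a POST to /services/collector/raw.

    With the raw endpoint, metadata such as sourcetype, source, host, and
    index travels in the query string, and the body is the raw JSON itself.
    """
    params = {"channel": channel}
    params.update(metadata or {})
    url = base_url + "/services/collector/raw?" + urllib.parse.urlencode(params)
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": "Splunk " + token},
        method="POST",
    )
```

Sending it would be a call to urllib.request.urlopen(request). With a self-signed certificate on port 8088 you would also need to pass an ssl context through the context argument, which is the problem curl's -k flag sidesteps.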

I’m not sure why this is the desired behavior. It would be nice to get clarification from a Splunk HEC dev.



Builder

@Kieffer87: did you try setting TIME_FORMAT in your config? For example:

    MAX_TIMESTAMP_LOOKAHEAD = 20
    TIME_FORMAT = %Y-%m-%dT%H:%M:%S%Z

Communicator

Yes, and I get the same result: the timestamp defaults to the time the data is received.


Builder

What's your workflow? Do you have this config on your Splunk HEC server?


Communicator

I'm running the Python script from my machine. A single Splunk instance handles HEC, indexing, and search.
