Getting Data In

How do you extract a timestamp from JSON logs that are being sent to an HTTP Event Collector?

emillg
New Member

Hi,

When sending logs to Splunk Cloud via HTTP Event Collector, Splunk is not able to extract the correct timestamp from the "date" field. However, when I upload the same logs as a file, Splunk extracts the correct timestamp automatically.

Can someone help? Thanks!



hernanb
Engager

I had the same problem. It has to do with the endpoint used to send data to HEC; there are two of them: the "event" endpoint and the "raw" endpoint. If you send data to the "event" endpoint, you will not be able to parse the data before indexing (with props and transforms). This is by design: Splunk assumes that everything sent to the "event" endpoint is already properly formatted and routes it directly to indexing. If you want Splunk to get the correct timestamp, make sure the "time" metadata key is set in the payload sent to Splunk, with its value in epoch format; when you do this, you will get the correct timestamp for your events. Other metadata keys that can be used are: index, source, sourcetype.
Here is a curl command that you can use to test sending data to HEC via the "event" endpoint:
curl -k -u "x:" "https://:8088/services/collector/event" -d '{"time":"1587590959", "index":"test","sourcetype": "mysourcetype", "event": "Testing events, Testing events!"}'
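The same request can be sketched in Python; note that the host name and token below are placeholders, not values from this thread:

```python
import json
import urllib.request

# Placeholder values -- substitute your own HEC host and token.
HEC_URL = "https://hec.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

# "time" must be epoch seconds; index, source, and sourcetype are the
# other metadata keys the "event" endpoint accepts alongside "event".
payload = {
    "time": 1587590959,
    "index": "test",
    "sourcetype": "mysourcetype",
    "event": "Testing events, Testing events!",
}

req = urllib.request.Request(
    HEC_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Authorization": f"Splunk {HEC_TOKEN}"},
)
# urllib.request.urlopen(req)  # uncomment to actually send
```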

woodcock
Esteemed Legend

You need to figure out what sourcetype is being used for these events. Then you need to create a sourcetype-based stanza in props.conf like this:

[YourSourcetypeHere]
TIME_PREFIX = 
TIME_FORMAT = 
MAX_TIMESTAMP_LOOKAHEAD = 

NOTE: If you have overridden the sourcetype anywhere, use the ORIGINAL sourcetype, not the new/overwritten value.

Deploy this to the first full instance of Splunk that handles the events (Heavy Forwarder tier or Indexer tier).
Restart all Splunk instances there.
Send in new data (old events will stay forever broken).
Be sure that you are looking at the new events by using All time in your timepicker and index_earliest=-5m.
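As a sketch only: for a "date" field like the one quoted later in this thread (2019-02-19T21:32:45.743Z), the filled-in stanza might look like the following, assuming the event actually passes through timestamp extraction (the sourcetype name is a placeholder):

```
[YourSourcetypeHere]
TIME_PREFIX = \"date\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3NZ
MAX_TIMESTAMP_LOOKAHEAD = 30
```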


PowerPacked
Builder

Hi

Give this a try

INDEXED_EXTRACTIONS = JSON
TIMESTAMP_FIELDS = date

Thanks


emillg
New Member

@PowerPacked Thanks, but that didn't work either


Vijeta
Influencer

@emillg - Check the sourcetype used for HEC and update the timestamp configuration accordingly, either in props.conf or by editing the sourcetype.


emillg
New Member

@Vijeta
I have tried the following in the sourcetype, but it didn't work. Did I miss something?

TIME_PREFIX = \"date\":\"
MAX_TIMESTAMP_LOOKAHEAD = 24

The raw text log looks like

{"message":{"date":"2019-02-19T21:32:45.743Z","type":"XXX","description":"","connection_id":"","client_id":"XXX","client_name":"XXX","ip":"XXX","user_agent":"XXX","hostname":"XXX","user_id":"","user_name":"","audience":"XXX","scope":null,"auth0_client":{"name":"auth0-java","version":"1.0.0"},"_id":"XXX","log_id":"XXX","isMobile":false},"severity":"info"}
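One workaround consistent with the answers above is to convert the "date" field to epoch on the sending side and pass it as the top-level "time" key of the HEC payload. A minimal sketch (the field names come from the event above; everything else is illustrative, and the sample event is truncated):

```python
import json
from datetime import datetime

# Truncated sample of the raw event shown above.
raw = '{"message":{"date":"2019-02-19T21:32:45.743Z","type":"XXX"},"severity":"info"}'
event = json.loads(raw)

# Parse the ISO-8601 "date"; replace the trailing "Z" because
# fromisoformat() only accepts it directly on newer Pythons.
iso = event["message"]["date"].replace("Z", "+00:00")
epoch = datetime.fromisoformat(iso).timestamp()  # 1550611965.743 for this event

# HEC "event" endpoint payload: epoch seconds in the top-level "time" key.
payload = {"time": epoch, "event": event}
```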


swebb07g
Path Finder

Did you ever get this working?


jason12vb
Engager

I don't know how to fully solve the OP's issue, but I did figure out how to do it with an epoch timestamp that shows up in the event.

Using an IDX transform on the sourcetype. (For me, the epoch time was at the start of _raw.)

[set_x-balancer_time]
SOURCE_KEY = _raw
REGEX = ^(\d{10}\.?\d*)\s
FORMAT = $1
DEST_KEY = _time

DEST_KEY = _time requires the timestamp to be in epoch format, so to make this work with a timestamp in another format you would first have to find a way to convert it to epoch.


PickleRick
SplunkTrust

With the event endpoint there is an assumption that the time has already been parsed out and is supplied as the time field along with the event data. The event therefore bypasses some steps of the parsing queue (timestamp recognition, line breaking), effectively lowering the load on the indexer/HF.

But since, I think, 8.0 you can add ?auto_extract_timestamp=true to the endpoint URL and the event will go through the timestamp-parsing phase.

See https://docs.splunk.com/Documentation/Splunk/8.2.2/Data/HECRESTendpoints
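For illustration, appending that query parameter to the event-endpoint URL (the host name below is a placeholder):

```python
from urllib.parse import urlencode

# Placeholder base URL -- substitute your own HEC host.
base = "https://hec.example.com:8088/services/collector/event"
url = base + "?" + urlencode({"auto_extract_timestamp": "true"})
print(url)
```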

jason12vb
Engager

Thanks for the suggestion, I'll try that out. I'm not sure how much control I have over the dynamic creation of the curl commands, though, to know whether I can add it for events that need it and leave it off for those that don't.

In my case, I'm using Splunk Connect for Syslog (SC4S).

 
