Finding a timestamp in a large event

alekksi
Communicator

Hi all,

I am putting some rather large JSON events into Splunk (they can be upwards of 100K characters). This is due in part to the API that the data is fetched from. I'm investigating other means of cutting the events down to size, but the main issue is that timestamp recognition often fails on events this large.

So the config itself is easy enough:

[my_sourcetype]
TIME_PREFIX = \"start\": \"
TIME_FORMAT=%s
MAX_TIMESTAMP_LOOKAHEAD=13
TRUNCATE=0

as the bit of JSON I'm looking for looks like this:

{"QueryTime": {"start": "1477390200000", "end": "1477390799999"}

Is there anything I can do to ensure the correct timestamp recognition of an event that is 50K+ characters long?

Thanks and best regards,
Alex
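
For anyone who wants to check the intent of that stanza outside Splunk, here is a minimal Python sketch. It only imitates the idea of "find the prefix, then hand the next 13 characters to the time parser"; the helper name `candidate_timestamp` is made up for illustration and this is not how Splunk is implemented.

```python
import re

# Hypothetical stand-ins for the props.conf settings in the post; not Splunk code.
TIME_PREFIX = re.compile(r'"start": "')
MAX_TIMESTAMP_LOOKAHEAD = 13


def candidate_timestamp(event: str):
    """Return the text a prefix-plus-lookahead rule would hand to the time parser."""
    match = TIME_PREFIX.search(event)
    if match is None:
        return None  # prefix never found, so nothing to parse
    begin = match.end()
    return event[begin:begin + MAX_TIMESTAMP_LOOKAHEAD]


sample = '{"QueryTime": {"start": "1477390200000", "end": "1477390799999"}'
print(candidate_timestamp(sample))
```

If this returns the expected digits for a sample event, the prefix and lookahead themselves are probably fine and the problem lies elsewhere.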

inventsekar
SplunkTrust

%s is the 10-digit epoch time in seconds. The last 3 digits of your value are milliseconds (%3N).
Can you please try:

    TIME_FORMAT=%s%3N
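
To illustrate the seconds-plus-milliseconds reading of the 13-digit value, here is a small Python sketch. The function name `parse_epoch_millis` is an assumption for this example only; it splits the string the same way %s%3N is meant to.

```python
import re
from datetime import datetime, timezone


def parse_epoch_millis(raw: str) -> datetime:
    """Interpret a 13-digit string as 10 digits of epoch seconds plus 3 digits of milliseconds."""
    if re.fullmatch(r"[0-9]{13}", raw) is None:
        raise ValueError(f"expected 13 digits, got {raw!r}")
    seconds, millis = int(raw[:10]), int(raw[10:])
    # Build the UTC time from the seconds, then attach the millisecond part.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=millis * 1000)


print(parse_epoch_millis("1477390200000").isoformat())
print(parse_epoch_millis("1477390799999").isoformat())
```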

For TIME_PREFIX, did you try including QueryTime as well?

TIME_PREFIX={\"QueryTime\": {\"start\": \"

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !

alekksi
Communicator

So I have had %s%3N before, but it didn't really make much odds either way -- though I'll put it back in for further testing.

Unfortunately the keys are not always in that order, as the API can just as easily output the data as:

{"QueryTime": {"end": "1477390799999", "start": "1477390200000"}

That means I have to apply some regex to cope with this possibility.
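
One possible shape for such a regex, sketched in Python: anchor on the QueryTime object and skip over anything inside it that comes before "start". The pattern and the helper `start_value` are assumptions for illustration and have not been tested as a Splunk TIME_PREFIX.

```python
import re

# Assumed pattern: find the QueryTime object, then lazily skip any keys
# inside it (without leaving the object) until the "start" key.
QUERYTIME_START = re.compile(r'"QueryTime":\s*\{[^}]*?"start":\s*"')


def start_value(event: str):
    """Return the 13 characters that follow the matched prefix, or None."""
    match = QUERYTIME_START.search(event)
    if match is None:
        return None
    return event[match.end():match.end() + 13]


events = [
    '{"QueryTime": {"start": "1477390200000", "end": "1477390799999"}',
    '{"QueryTime": {"end": "1477390799999", "start": "1477390200000"}',
]
for event in events:
    print(start_value(event))
```

Tying the pattern to QueryTime also avoids picking up an unrelated "start" key that might appear earlier in a large payload.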

inventsekar
SplunkTrust

Let's try one of these regexes:
TIME_PREFIX=.*start\":\s\"\d{13}
or
TIME_PREFIX=.*start\S\S\s\S\d{13}
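
One thing worth checking with both patterns: as I read the props.conf documentation, Splunk only looks for the timestamp in the text after the end of the TIME_PREFIX match. A prefix that also matches the 13 digits would therefore leave nothing to parse. The Python sketch below shows where each variant stops; the helper name `text_after_prefix` is made up for this example.

```python
import re

event = '{"QueryTime": {"end": "1477390799999", "start": "1477390200000"}'

# First pattern from the post, and the same pattern without the trailing digits.
with_digits = re.compile(r'.*start":\s"\d{13}')
without_digits = re.compile(r'.*start":\s"')


def text_after_prefix(pattern, text, width=13):
    """Return up to `width` characters following the end of the prefix match."""
    match = pattern.search(text)
    if match is None:
        return None
    return text[match.end():match.end() + width]


for name, pattern in (("with digits", with_digits), ("without digits", without_digits)):
    print(f"{name}: {text_after_prefix(pattern, event)!r}")
```

Only the variant without the digits leaves the epoch value available after the prefix.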

thanks and best regards,
Sekar

alekksi
Communicator

So I have much the same issue with this as with the previous regex. It seems that the indexers will only look a certain number of characters into an event before giving up, apparently about 10K characters.
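
To test the "about 10K characters" suspicion against real data, one could measure how far into each event the "start" key sits and compare that with which events lose their timestamp. A rough Python sketch follows; the synthetic event and the helper `start_offset` are assumptions for illustration only.

```python
import json
import re

START_KEY = re.compile(r'"start":\s*"')


def start_offset(event: str):
    """Character offset at which the "start" key begins, or None if absent."""
    match = START_KEY.search(event)
    return None if match is None else match.start()


# Synthetic event: roughly 50K characters of padding ahead of QueryTime.
padding = {"rows": ["x" * 100] * 500}
event = json.dumps({**padding, "QueryTime": {"start": "1477390200000", "end": "1477390799999"}})

print(len(event), start_offset(event))
```

If the failures line up with offsets above some threshold, that would support the idea of a character limit rather than a regex problem.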

I've had other regex patterns that have looked for other timestamps in the JSON, but these seem even less reliable than just looking for the start time of the data set requested from the API.
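
If the limit turns out to be real and the fetching script is under your control, one possible workaround is to move QueryTime to the front of each event before it reaches Splunk. This is only a sketch under that assumption; `query_time_first` is a hypothetical helper, and it assumes each event is a single JSON object.

```python
import json


def query_time_first(raw_event: str) -> str:
    """Re-serialise a JSON object with its "QueryTime" key moved to the front."""
    data = json.loads(raw_event)
    if not isinstance(data, dict) or "QueryTime" not in data:
        return raw_event  # leave anything unexpected untouched
    reordered = {"QueryTime": data["QueryTime"]}
    # Python dicts keep insertion order, so the remaining keys follow QueryTime.
    reordered.update((key, value) for key, value in data.items() if key != "QueryTime")
    return json.dumps(reordered)


raw = '{"rows": [1, 2], "QueryTime": {"end": "1477390799999", "start": "1477390200000"}}'
print(query_time_first(raw))
```

With the timestamp near the start of the event, a short prefix and lookahead should be enough regardless of how large the rest of the payload is.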
