Finding a timestamp in a large event

alekksi
Communicator

Hi all,

I am indexing some rather large JSON events in Splunk (they can be upwards of 100K characters), partly because of the API the data is fetched from. I'm investigating other ways of cutting the events down to size, but the main issue is that timestamp recognition often fails on events this large.

So the config itself is easy enough:

[my_sourcetype]
TIME_PREFIX = \"start\": \"
TIME_FORMAT=%s
MAX_TIMESTAMP_LOOKAHEAD=13
TRUNCATE=0

as the bit of JSON I'm looking for looks like this:

{"QueryTime": {"start": "1477390200000", "end": "1477390799999"}

Is there anything I can do to ensure the correct timestamp recognition of an event that is 50K+ characters long?

Thanks and best regards,
Alex

inventsekar
SplunkTrust

%s is a 10-digit epoch time in seconds. Your value has 13 digits, so the last 3 digits are milliseconds (%3N).
Can you please try:

    TIME_FORMAT=%s%3N
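
To see why the 13-digit value splits into seconds plus milliseconds, here is a minimal Python sketch. It is only an illustration of the arithmetic, not how Splunk parses timestamps internally; the helper name `parse_epoch_millis` is made up for this example.

```python
from datetime import datetime, timezone

def parse_epoch_millis(raw: str) -> datetime:
    """Split a 13-digit epoch string into seconds (like %s) and milliseconds (like %3N)."""
    seconds = int(raw[:10])    # first 10 digits: whole seconds since the epoch
    millis = int(raw[10:13])   # last 3 digits: milliseconds
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=millis * 1000)

# Values taken from the sample event in the question.
start = parse_epoch_millis("1477390200000")
end = parse_epoch_millis("1477390799999")
print(start.isoformat())
print(end.isoformat())
```

Under this reading the two values are just under ten minutes apart, which fits a start/end query window.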

For TIME_PREFIX, did you try including QueryTime as well?

TIME_PREFIX={\"QueryTime\": {\"start\": \" 

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !

alekksi
Communicator

So I have had %s%3N before, but it didn't really make much odds either way -- though I'll put it back in for further testing.

Unfortunately it's not always in that order as it can just as easily output the data as:

{"QueryTime": {"end": "1477390799999", "start": "1477390200000"}

That means I need a regex that copes with either ordering.
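
A rough way to check a candidate pattern against both orderings outside Splunk is a few lines of Python. This is only a sketch: it uses Python's `re` module rather than Splunk's PCRE engine, and the pattern and the helper name `find_start` are assumptions chosen for illustration. The idea is to anchor only on the `"start"` key itself, so its position inside QueryTime does not matter.

```python
import re

# Assumed pattern: match the "start" key wherever it appears, then the opening quote.
START_PREFIX = re.compile(r'"start":\s*"')

def find_start(event: str):
    """Return the 13 characters following the prefix match, or None if no match."""
    m = START_PREFIX.search(event)
    if m is None:
        return None
    return event[m.end():m.end() + 13]

# The two orderings shown in this thread.
start_first = '{"QueryTime": {"start": "1477390200000", "end": "1477390799999"}'
end_first = '{"QueryTime": {"end": "1477390799999", "start": "1477390200000"}'

print(find_start(start_first))
print(find_start(end_first))
```

One caveat: if some other `"start"` key can appear earlier in the event, this pattern would pick that one up first.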

inventsekar
SplunkTrust

Let's try one of these regexes:
TIME_PREFIX=.*start\":\s\"\d{13}
or
TIME_PREFIX=.*start\S\S\s\S\d{13}
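
One thing worth checking with patterns like these: Splunk looks for the timestamp in the text that follows the end of the TIME_PREFIX match, so a prefix ending in `\d{13}` may consume the timestamp itself. The Python sketch below approximates this with the `re` module (not Splunk's PCRE engine); the helper `text_after` and the variant without `\d{13}` are assumptions added for comparison.

```python
import re

event = '{"QueryTime": {"end": "1477390799999", "start": "1477390200000"}'

# The first suggested pattern, and an assumed variant that stops before the digits.
with_digits = re.compile(r'.*start":\s"\d{13}')
without_digits = re.compile(r'.*start":\s"')

def text_after(pattern, text, width=13):
    """Return up to `width` characters following the end of the pattern's match."""
    m = pattern.search(text)
    return text[m.end():m.end() + width] if m else None

print(repr(text_after(with_digits, event)))     # what is left once the digits are consumed
print(repr(text_after(without_digits, event)))  # the 13-digit value itself
```

In this sketch only the variant without `\d{13}` leaves the epoch value available after the match.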

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !

alekksi
Communicator

I have much the same issue with these as with the previous regex. It seems the indexers only scan a limited number of characters before giving up, apparently about 10K.

I've had other regex patterns that have looked for other timestamps in the JSON, but these seem even less reliable than just looking for the start time of the data set requested from the API.
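
To test whether the failures really line up with how deep the timestamp sits in the event, a small Python sketch like the one below could report the character offset of the `"start"` value for sample events. The pattern, the helper name `timestamp_offset`, and the synthetic event are all assumptions for illustration; the 10K threshold is only the figure observed above, not a documented limit.

```python
import re

# Assumed pattern: the "start" key followed by its opening quote.
START_PREFIX = re.compile(r'"start":\s*"')

def timestamp_offset(event: str):
    """Return the character offset where the timestamp digits begin, or None."""
    m = START_PREFIX.search(event)
    return m.end() if m else None

# Synthetic event: roughly 12K characters of filler placed before QueryTime.
padding = '"filler": "' + "x" * 12000 + '", '
event = "{" + padding + '"QueryTime": {"start": "1477390200000", "end": "1477390799999"}}'

offset = timestamp_offset(event)
print(offset, offset > 10_000)
```

Running this over events that were timestamped correctly and events that were not would show whether the offset is the deciding factor.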
