Getting Data In

Finding a timestamp in a large event

alekksi
Communicator

Hi all,

I am putting some JSON events into Splunk which are rather large (can be upwards of 100K characters). This is due in part to the API that the data is being fetched from. I'm investigating other means of of cutting the events down to size, but the main issue I have is that timestamp recognition often fails with an event this large.

So the config itself is easy enough:

[my_sourcetype]
TIME_PREFIX = \"start\": \"
TIME_FORMAT=%s
MAX_TIMESTAMP_LOOKAHEAD=13
TRUNCATE=0

as the bit of JSON I'm looking for looks like this:

{"QueryTime": {"start": "1477390200000", "end": "1477390799999"}

Is there anything I can do to ensure the correct timestamp recognition of an event that is 50K+ characters long?

Thanks and best regards,
Alex

0 Karma

inventsekar
SplunkTrust
SplunkTrust

%s is 10 digit epoch time. the last 3 digits are milli seconds (%3N)
can you please try -

    TIME_FORMAT=%s%3N

for TIME_PREFIX, did you try including the QueryTime as well ?!?!

TIME_PREFIX={\"QueryTime\": {\"start\": \" 

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !
0 Karma

alekksi
Communicator

So I have had %s%3N before, but it didn't really make much odds either way -- though I'll put it back in for further testing.

Unfortunately it's not always in that order as it can just as easily output the data as:

{"QueryTime": {"end": "1477390799999", "start": "1477390200000"}

Which then means I have to apply some regex in order to get past this possibility.

0 Karma

inventsekar
SplunkTrust
SplunkTrust

lets try this regex...
TIME_PREFIX=.*start\":\s\"\d{13}
or
TIME_PREFIX=.*start\S\S\s\S\d{13}

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !
0 Karma

alekksi
Communicator

So I have much the same issue with this as with previous regex. It seems that the indexers will only look up to a given number of characters before it gives up, seems to be about 10K characters.

I've had other regex patterns that have looked for other timestamps in the JSON, but these seem even less reliable than just looking for the start time of the data set requested from the API.

0 Karma
Get Updates on the Splunk Community!

What the End of Support for Splunk Add-on Builder Means for You

Hello Splunk Community! We want to share an important update regarding the future of the Splunk Add-on Builder ...

Solve, Learn, Repeat: New Puzzle Channel Now Live

Welcome to the Splunk Puzzle PlaygroundIf you are anything like me, you love to solve problems, and what ...

Building Reliable Asset and Identity Frameworks in Splunk ES

 Accurate asset and identity resolution is the backbone of security operations. Without it, alerts are ...