How to extract an event timestamp where seconds and milliseconds are concatenated without padded zeros?

Motivator

I came across a weird log format where the seconds and milliseconds are concatenated into a single number without zero padding.

Example data

2019,8,6,9,31,1,event data
2019,8,6,9,31,12,event data
2019,8,6,9,31,123,event data
2019,8,6,9,31,1234,event data
2019,8,6,9,31,12345,event data

Problem
From my testing, TIME_FORMAT doesn't handle this case correctly. It would if the number were zero-padded (e.g. 00012).
Formats I tested and the results:
%Y,%m,%d,%H,%M,%S%3N - works on the 5-digit value but not on the others, which show the wrong number of seconds
%Y,%m,%d,%H,%M,%S - same as above: in most cases it shows the wrong number of seconds
%Y,%m,%d,%H,%M,%5N - doesn't extract anything after the minutes
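
To make the mismatch concrete, here is a small Python sketch comparing a fixed-width reading (two digits of seconds, then milliseconds) with the intended reading (the whole field as milliseconds within the minute). This is only an illustration of why a fixed-width format can't fit the data; it is not a model of how Splunk's timestamp parser actually behaves, and the function names are made up for the example.

```python
def fixed_width_read(field):
    """Illustrative only: take the first two digits as seconds, the rest as ms."""
    seconds = int(field[:2])
    millis = int(field[2:] or 0)
    return seconds, millis


def intended_read(field):
    """Treat the whole field as a millisecond count within the minute."""
    return divmod(int(field), 1000)


# The sixth column of each sample event from the question.
for field in ["1", "12", "123", "1234", "12345"]:
    print(field, "fixed-width:", fixed_width_read(field), "intended:", intended_read(field))
```

Only the 5-digit value gives the same (seconds, milliseconds) pair under both readings, which matches what I saw with %S%3N.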

How can I solve this without building a custom input or pre-processing the data before indexing it?

------------
Hope I was able to help you. If so, an upvote would be appreciated.
1 Solution

Motivator

Starting with Splunk 7.2, it's possible to run some eval operations at index time by using the INGEST_EVAL attribute in transforms.conf and applying the transforms to the source type in question.
So, in this case we can do the following configuration:

transforms.conf

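# Capture the sixth comma-separated column and write it as an indexed field.
[get_sec_msec]
REGEX = ^(?:\d+,){5}(?<sec_msec>\d+),
FORMAT = sec_msec::$1
WRITE_META = true

# Add the captured value, converted from milliseconds to seconds, to _time.
[eval_sec]
INGEST_EVAL = _time=round(_time+(sec_msec/1000),3)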

props.conf

[your_sourcetype]
TRANSFORMS-evalingest = get_sec_msec, eval_sec
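
If you want to sanity-check the regex outside Splunk before deploying, a quick Python sketch like the one below will do. Note that Python's re module spells named groups as (?P<name>...), so the pattern is adapted slightly from the one in transforms.conf; the helper name extract_sec_msec is just for this example.

```python
import re

# Same pattern as in transforms.conf, with Python's (?P<name>...) group syntax.
SEC_MSEC = re.compile(r"^(?:\d+,){5}(?P<sec_msec>\d+),")

SAMPLES = [
    "2019,8,6,9,31,1,event data",
    "2019,8,6,9,31,12,event data",
    "2019,8,6,9,31,123,event data",
    "2019,8,6,9,31,1234,event data",
    "2019,8,6,9,31,12345,event data",
]


def extract_sec_msec(line):
    """Return the sixth comma-separated column as a string, or None if no match."""
    match = SEC_MSEC.match(line)
    return match.group("sec_msec") if match else None


extracted = [extract_sec_msec(line) for line in SAMPLES]
print(extracted)
```

Each sample line should yield its sixth column unchanged, regardless of how many digits it has.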

Explanation:
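The approach I used was to extract the number at index time and then, using INGEST_EVAL, divide it by 1000 and add the result to _time. Note that this assumes _time was parsed only down to the minute (for example with TIME_FORMAT = %Y,%m,%d,%H,%M); otherwise the seconds would be counted twice.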

Example
2019,8,6,9,31,1234,event data
the correct extraction would be 1 sec and 234 msec
1234/1000 = 1.234
_time = _time + 1.234

I use round(..., 3) around the whole sum to force the result to keep the .234. In my testing, without the round I ended up with only the seconds added and not the milliseconds.
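
The same arithmetic can be sketched in Python for all five sample events. This is a rough stand-in for what the INGEST_EVAL does, not Splunk code: it assumes the base time is the event's timestamp truncated to the minute, and it assumes UTC, since the sample data carries no time zone. The function name event_time is hypothetical.

```python
from datetime import datetime, timezone


def event_time(line):
    """Return epoch seconds: minute-truncated timestamp plus sec_msec / 1000."""
    parts = line.split(",")
    year, month, day, hour, minute = (int(p) for p in parts[:5])
    sec_msec = int(parts[5])
    # Assumption: timestamps are UTC; swap in the real zone if it differs.
    base = datetime(year, month, day, hour, minute, tzinfo=timezone.utc).timestamp()
    return round(base + sec_msec / 1000, 3)


minute_start = event_time("2019,8,6,9,31,0,event data")
for value in ["1", "12", "123", "1234", "12345"]:
    t = event_time("2019,8,6,9,31," + value + ",event data")
    stamp = datetime.fromtimestamp(t, tz=timezone.utc).isoformat(timespec="milliseconds")
    print(value, "->", round(t - minute_start, 3), "seconds past the minute,", stamp)
```

The offsets past the minute come out as 0.001, 0.012, 0.123, 1.234 and 12.345 seconds, which is the behaviour the transform is meant to reproduce.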

