Getting Data In

How to extract an event timestamp where seconds and milliseconds are concatenated without padded zeros?

diogofgm
SplunkTrust

I came across a weird log format where the seconds and milliseconds are concatenated without padded zeros.

Example data

2019,8,6,9,31,1,event data
2019,8,6,9,31,12,event data
2019,8,6,9,31,123,event data
2019,8,6,9,31,1234,event data
2019,8,6,9,31,12345,event data

Problem
From my testing, TIME_FORMAT doesn't work correctly in this case. It would if the number were zero-padded (e.g. 00012).
Formats I tested and the results:
%Y,%m,%d,%H,%M,%S%3N - works for the 5-digit value but not the others, which end up with the wrong number of seconds
%Y,%m,%d,%H,%M,%S - same as above; in most cases it shows the wrong number of seconds
%Y,%m,%d,%H,%M,%5N - doesn't extract anything after the minutes

How can I solve this without building a custom input or pre-processing the data before indexing it?

------------
Hope I was able to help you. If so, some karma would be appreciated.
1 Solution

diogofgm
SplunkTrust

Starting with Splunk 7.2, it's possible to run some eval operations at index time by using the INGEST_EVAL attribute in transforms.conf and applying the transform to the source type in question.
So, in this case, we can use the following configuration:

transforms.conf

[get_sec_msec]
REGEX = ^(?:\d+,){5}(?<sec_msec>\d+),
FORMAT = sec_msec::$1
WRITE_META = true

[eval_sec]
INGEST_EVAL = _time=round(_time+(sec_msec/1000),3)

props.conf

[your_sourcetype]
TRANSFORMS-evalingest = get_sec_msec, eval_sec

Explanation:
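The approach I used was to extract the number at index time and then, using INGEST_EVAL, divide it by 1000 and add the result to _time. Note that this only gives the right result if _time was parsed down to the minute (no seconds), so that the extracted number supplies all of the seconds and milliseconds.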

Example
2019,8,6,9,31,1234,event data
the correct extraction would be 1 sec and 234 msec
1234/1000 = 1.234
_time = _time + 1.234

I use round() to force the value to keep the .234. In my testing, without the round(..., 3) only the seconds were added to _time and not the milliseconds.
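
To sanity-check the arithmetic outside Splunk, here is a minimal Python sketch that applies the same calculation to the five sample lines. It is only an illustration, not Splunk code: the function name event_time is made up for this example, it assumes the timestamps are UTC, and it assumes the base time has minute resolution (as _time would if TIME_FORMAT stopped at %M).

```python
from datetime import datetime, timezone

# Sample events from the question
SAMPLE_LINES = [
    "2019,8,6,9,31,1,event data",
    "2019,8,6,9,31,12,event data",
    "2019,8,6,9,31,123,event data",
    "2019,8,6,9,31,1234,event data",
    "2019,8,6,9,31,12345,event data",
]


def event_time(line: str) -> float:
    """Return epoch seconds for one sample line (hypothetical helper, UTC assumed)."""
    year, month, day, hour, minute, sec_msec = (int(p) for p in line.split(",")[:6])
    # Minute-resolution base time, standing in for what _time is assumed to hold
    base = datetime(year, month, day, hour, minute, tzinfo=timezone.utc).timestamp()
    # Same arithmetic as the INGEST_EVAL expression: round(_time + sec_msec/1000, 3)
    return round(base + sec_msec / 1000, 3)


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        ts = datetime.fromtimestamp(event_time(line), tz=timezone.utc)
        print(line, "->", ts.isoformat(timespec="milliseconds"))
```

Each line's sixth field is treated as a total number of milliseconds, so 1 becomes 0.001 s and 12345 becomes 12.345 s past the minute.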
