Getting Data In

How to apply an arbitrary offset to the timestamp at index time?

martin_mueller
SplunkTrust

I have an input that writes timestamps as the number of milliseconds elapsed since January 1st, 1601. Sadly, the format cannot be changed to either a human-readable or a Unix timestamp.

For example, 12995561169293 corresponds to October 24th 2012, 14:06:09. Splunk interprets this as a Unix timestamp, treating the first ten digits as seconds and the last four digits as the fraction (milliseconds plus one digit of 100 microseconds): 1299556116.929(3), corresponding to March 8th 2011, 04:48:36.929.
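To make the arithmetic concrete, here is a minimal Python sketch of both readings of the example value. The function names are made up for illustration; the only outside fact is the well-known gap of 11,644,473,600 seconds between January 1st 1601 and the Unix epoch. Results are computed in UTC, so the hour shown for the misread value can differ from the local time quoted above.

```python
from datetime import datetime, timedelta, timezone

# Gap between 1601-01-01 and 1970-01-01 (the Unix epoch), in milliseconds.
EPOCH_DELTA_MS = 11_644_473_600 * 1000

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms1601_to_unix_ms(ms_since_1601: int) -> int:
    """Convert milliseconds since 1601-01-01 to milliseconds since 1970-01-01."""
    return ms_since_1601 - EPOCH_DELTA_MS


def misread_as_unix(ms_since_1601: int) -> tuple[int, str]:
    """Mimic the misreading described above: the first ten digits are
    taken as Unix seconds, the remaining digits as the fraction."""
    digits = str(ms_since_1601)
    return int(digits[:10]), digits[10:]


raw = 12995561169293

# Intended reading: subtract the constant offset.
unix_ms = ms1601_to_unix_ms(raw)
intended = UNIX_EPOCH + timedelta(milliseconds=unix_ms)
print(intended.isoformat())  # 2012-10-24T14:06:09.293000+00:00

# Misreading: 1299556116 seconds plus a ".9293" fraction.
seconds, fraction = misread_as_unix(raw)
print(seconds, fraction, (UNIX_EPOCH + timedelta(seconds=seconds)).date())
```

The conversion itself is a single subtraction, which is exactly the step a regex-only transformation cannot perform.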

I can convert "my" timestamp into a Unix timestamp by subtracting a constant with an external preprocessing application before loading a file into Splunk. However, I'd prefer it if I could teach Splunk to understand it directly.

The usual sed/regex transformations at index time cannot do the maths to subtract the offset. Is there any other way to do the conversion within Splunk?

1 Solution

yannK
Splunk Employee

A regex will not be able to do subtractions for you.
It seems that the only method is to use a scripted input that will parse the events before indexing.
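As a rough illustration of the parsing step such a script would need, here is a minimal Python sketch that rewrites the timestamp on each line before the data reaches Splunk. It assumes the timestamp is a run of digits at the very start of every event, which is a guess about the log layout; `rewrite_line` and `rewrite_stream` are hypothetical names, and nothing here handles log rotation or resuming after a restart.

```python
import re
from typing import Iterable, Iterator

# Gap between 1601-01-01 and the Unix epoch, in milliseconds.
EPOCH_DELTA_MS = 11_644_473_600 * 1000

# Assumption: each event starts with the timestamp as a run of digits.
LEADING_TIMESTAMP = re.compile(r"^(\d+)")


def rewrite_line(line: str) -> str:
    """Replace a leading ms-since-1601 timestamp with Unix seconds.millis."""
    match = LEADING_TIMESTAMP.match(line)
    if match is None:
        # No leading timestamp: pass the line through unchanged.
        return line
    unix_ms = int(match.group(1)) - EPOCH_DELTA_MS
    seconds, millis = divmod(unix_ms, 1000)
    return f"{seconds}.{millis:03d}{line[match.end():]}"


def rewrite_stream(lines: Iterable[str]) -> Iterator[str]:
    """Lazily rewrite every line of an input stream."""
    for line in lines:
        yield rewrite_line(line)
```

Used as a filter, this could be driven with something like `sys.stdout.writelines(rewrite_stream(sys.stdin))`; the output then carries an ordinary epoch timestamp with millisecond precision.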

woodcock
Esteemed Legend

You can set TZ=+NumberOfHoursToAddHere:NumberOfMinutesToAddHere in props.conf.
You can also look at a solution using Cribl:
https://www.cribl.io/

martin_mueller
SplunkTrust

Do you have a working example using TZ?

martin_mueller
SplunkTrust

Just under six years later, 7.2 promises a fix \o/

http://docs.splunk.com/Documentation/Splunk/7.2.0/Admin/transformsconf

INGEST_EVAL = <comma-separated list of evaluator expressions>
* NOTE: This setting is only valid for index-time field extractions.
* Optional. When you set INGEST_EVAL, this setting overrides all of the other 
  index-time settings (such as REGEX, DEST_KEY, etc) and declares the 
  index-time extraction to be evaluator-based.
* The expression takes a similar format to the search-time "|eval" command.
  For example "a=b+c*d" Just like the search-time operator, you can
  string multiple expressions together, separated by commas like
  "len=length(_raw), length_category=floor(log(len,2))".
* Keys which are commonly used with DEST_KEY or SOURCE_KEY (like
  "_raw", "queue", etc) can be used directly in the expression.
  Also available are values which would be populated by default when
  this event is searched ("source", "sourcetype", "host", "splunk_server",
  "linecount", "index"). Search-time calculated fields (the "EVAL-" settings
  in props.conf) are NOT available.
* When INGEST_EVAL accesses the "_time" variable, subsecond information is 
  included. This is unlike regular-expression-based index-time extractions, 
  where  "_time" values are limited to whole seconds.
...

martin_mueller
SplunkTrust

Using scripted inputs to do the conversion means I need to re-implement the handling of log rotations and correct tailing after restarts, right?

I was hoping to get around that with some kind of more-powerful-than-sed pre-processing at index time.
