I have an input that writes timestamps as the number of milliseconds elapsed since January 1st 1601. Sadly, the format cannot be changed to either a human-readable or a Unix timestamp.
For example, 12995561169293 corresponds to October 24th 2012, 14:06:09. Splunk interprets this as a Unix timestamp, treating the last four digits as the fractional part (milliseconds plus one 100-microsecond digit): 1299556116.929(3), corresponding to March 8th 2011, 04:48:36.929.
I can convert "my" timestamp into a Unix timestamp by subtracting a constant in an external preprocessing application before loading a file into Splunk. However, I'd prefer it if I could teach Splunk to understand it directly.
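For reference, the constant in question can be derived rather than hard-coded. The following is only a minimal sketch of that arithmetic, assuming both epochs are taken in UTC; the names `EPOCH_OFFSET_MS`, `to_unix_ms` and `to_datetime` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
EPOCH_1970 = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Milliseconds between the two epochs, computed with exact integer arithmetic.
EPOCH_OFFSET_MS = (EPOCH_1970 - EPOCH_1601) // timedelta(milliseconds=1)


def to_unix_ms(ms_since_1601: int) -> int:
    """Convert milliseconds since 1601-01-01 to milliseconds since 1970-01-01."""
    return ms_since_1601 - EPOCH_OFFSET_MS


def to_datetime(ms_since_1601: int) -> datetime:
    """Interpret milliseconds since 1601-01-01 as a UTC datetime."""
    return EPOCH_1601 + timedelta(milliseconds=ms_since_1601)


if __name__ == "__main__":
    sample = 12995561169293  # the example value from the question
    print(EPOCH_OFFSET_MS)
    print(to_unix_ms(sample))
    print(to_datetime(sample).isoformat())
```

Working in integer milliseconds avoids floating-point rounding in the sub-second part.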
The usual sed/regex transformations at index time cannot do the maths to subtract the offset. Is there any other way to do the conversion within Splunk?
A regex will not be able to do subtractions for you.
It seems that the only method is to use a scripted input that will parse the events before indexing.
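As a rough sketch of what the conversion step of such a script might look like: the filter below rewrites a leading millisecond timestamp on each line into Unix seconds with milliseconds. It assumes the timestamp is the first run of digits at the start of the line, which may not match the real log format, and the function name `rewrite_line` is hypothetical. It only converts lines from stdin to stdout; it does not deal with file tailing or log rotation.

```python
import re
import sys

# Milliseconds between 1601-01-01 and 1970-01-01 (134774 days * 86400 s * 1000).
EPOCH_OFFSET_MS = 11_644_473_600_000

# Assumption: the timestamp is the leading run of digits on each line.
LEADING_TS = re.compile(r"^(\d+)")


def rewrite_line(line: str) -> str:
    """Replace a leading ms-since-1601 timestamp with Unix seconds.milliseconds."""
    def repl(match: re.Match) -> str:
        unix_ms = int(match.group(1)) - EPOCH_OFFSET_MS
        secs, millis = divmod(unix_ms, 1000)
        return f"{secs}.{millis:03d}"

    # Lines without a leading timestamp pass through unchanged.
    return LEADING_TS.sub(repl, line, count=1)


if __name__ == "__main__":
    for raw_line in sys.stdin:
        sys.stdout.write(rewrite_line(raw_line))
```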
Using scripted inputs to do the conversion means I need to re-implement the handling of log rotations and correct tailing after restarts, right?
I was hoping to get around that with some kind of more-powerful-than-sed pre-processing at index time.
Just under six years later, Splunk 7.2 promises a fix with the new INGEST_EVAL setting \o/
INGEST_EVAL = <comma-separated list of evaluator expressions>
* NOTE: This setting is only valid for index-time field extractions.
* Optional. When you set INGEST_EVAL, this setting overrides all of the other index-time settings (such as REGEX, DEST_KEY, etc) and declares the index-time extraction to be evaluator-based.
* The expression takes a similar format to the search-time "|eval" command. For example "a=b+c*d". Just like the search-time operator, you can string multiple expressions together, separated by commas like "len=length(_raw), length_category=floor(log(len,2))".
* Keys which are commonly used with DEST_KEY or SOURCE_KEY (like "_raw", "queue", etc) can be used directly in the expression. Also available are values which would be populated by default when this event is searched ("source", "sourcetype", "host", "splunk_server", "linecount", "index"). Search-time calculated fields (the "EVAL-" settings in props.conf) are NOT available.
* When INGEST_EVAL accesses the "_time" variable, subsecond information is included. This is unlike regular-expression-based index-time extractions, where "_time" values are limited to whole seconds.
...