Solved: Special epoch timestamp recognition

chris · 08-07-2013

Can Splunk somehow recognize the following timestamp format: 1.375944219123E9?

It is the epoch time, with milliseconds, written in floating-point exponential notation.
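To make the format concrete, here is a small Python sketch (not Splunk configuration) that decodes the sample value from the question. The helper name `parse_exponential_epoch` is made up for illustration, and `Decimal` is used instead of `float` so the millisecond digits are not subject to binary rounding.

```python
from datetime import datetime, timezone
from decimal import Decimal


def parse_exponential_epoch(raw):
    """Split an epoch written like 1.375944219123E9 into (seconds, milliseconds)."""
    value = Decimal(raw)                      # exact decimal value, no float rounding
    seconds = int(value)                      # whole seconds since 1970-01-01 UTC
    millis = int((value - seconds) * 1000)    # remaining fraction as milliseconds
    return seconds, millis


seconds, millis = parse_exponential_epoch("1.375944219123E9")
print(seconds, millis)                        # 1375944219 123
print(datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())
```

So the value is an ordinary 10-digit epoch in seconds plus three millisecond digits; only the position of the "." and the trailing "E9" make it awkward to parse.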

As far as I can see, strptime does not support this format.

I then thought that I could just get rid of the "." using SEDCMD in props.conf, but SEDCMD is only executed after timestamp recognition.

I also think it is not possible to write a custom datetime.xml that drops the ".", because datetime.xml is regex based and a capturing group cannot skip a character in the middle of its match, but I might be wrong.

Any ideas? A confirmation that it is not possible to read this format would help as well.

Thanks

Chris

kristian_kolb · 08-08-2013

Hmm. Tricky. Is it possible to make a TRANSFORM on _time?

It would require that you set your TIME_PREFIX to include the "1." part of the timestamp, and then set TIME_FORMAT to %s%3N. That would give you a timestamp in the early 1980s.

Then with a TRANSFORM on _time you add the starting "1" (and perhaps remove the millisecond part)...
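As a quick sanity check of the "early 1980s" estimate, this Python sketch (not Splunk configuration) shows what the digits after "1." amount to as epoch seconds, and what putting the leading "1" back does. It assumes that %s would consume the first 9 digits and %3N the last 3. It only does the arithmetic; whether Splunk's pipeline allows the transform step is exactly the open question here.

```python
from datetime import datetime, timezone

raw = "1.375944219123E9"

# Assumption: with "1." swallowed by TIME_PREFIX, the parser is left with
# 9 digits of seconds followed by 3 digits of milliseconds.
digits = raw.split(".")[1].split("E")[0]      # "375944219123"
partial_seconds = int(digits[:9])             # 375944219
partial_millis = int(digits[9:])              # 123

partial = datetime.fromtimestamp(partial_seconds, tz=timezone.utc)
print(partial.year)                           # 1981

# Putting the leading "1" back is the same as adding 10**9 seconds.
full_seconds = partial_seconds + 10**9
full = datetime.fromtimestamp(full_seconds, tz=timezone.utc)
print(full.year)                              # 2013
```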

I have not done this before, so take it as a theory that might be worth investigating. Perhaps this is not at all possible. It's at times like these I wish I knew all of the parsing/indexing phase processors by heart, and in which order they come. 🙂

From the docs on transforms.conf:

FORMAT = <string>
* NOTE: This option is valid for both index-time and search-time field extraction. However, FORMAT 
  behaves differently depending on whether the extraction is performed at index time or 
  search time.
* This attribute specifies the format of the event, including any field names or values you want 
  to add.
* FORMAT for index-time extractions:
    * Use $n (for example $1, $2, etc) to specify the output of each REGEX match. 
    * If REGEX does not have n groups, the matching fails. 
    * The special identifier $0 represents what was in the DEST_KEY before the REGEX was performed.
    * At index time only, you can use FORMAT to create concatenated fields:
        * FORMAT = ipaddress::$1.$2.$3.$4
    * When you create concatenated fields with FORMAT, "$" is the only special character. It is 
      treated as a prefix for regex-capturing groups only if it is followed by a number and only 
      if the number applies to an existing capturing group. So if REGEX has only one capturing 
      group and its value is "bar", then:
        * "FORMAT = foo$1" yields "foobar"
        * "FORMAT = foo$bar" yields "foo$bar"
        * "FORMAT = foo$1234" yields "foo$1234"
        * "FORMAT = foo$1\$2" yields "foobar\$2"
    * At index-time, FORMAT defaults to <stanza-name>::$1
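The "$n" substitution rules quoted above can be mimicked in a few lines of Python. This is only a sketch that reproduces the four documented examples, not Splunk's actual implementation: the function name `apply_format` is hypothetical, "$0" (the previous DEST_KEY value) is not modelled, and reading the whole digit run after "$" as the group number is an assumption that happens to agree with the "foo$1234" example.

```python
import re


def apply_format(fmt, groups):
    """Expand $n in fmt using 1-based capture groups; leave any other "$" untouched."""

    def substitute(match):
        n = int(match.group(1))
        if 1 <= n <= len(groups):
            return groups[n - 1]
        return match.group(0)          # no such group: keep the text literally

    return re.sub(r"\$(\d+)", substitute, fmt)


groups = ("bar",)                      # a REGEX with a single capturing group
for fmt in ("foo$1", "foo$bar", "foo$1234", r"foo$1\$2"):
    print(fmt, "->", apply_format(fmt, groups))
```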

Hope this helps a little,

K

Update:

Hi K,
you sent me in the right direction. This works without the subsecond part:

props.conf
[epo]
TRANSFORMS-epo=epo

transforms.conf
[epo]
DEST_KEY = _time
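# This doesn't work (captures all 12 digits, including the milliseconds):
#REGEX =  (1)\.(\d{12})E9
# This does work (captures the leading 1 and the 9 whole-second digits, drops the milliseconds):
REGEX =  (1)\.(\d{9})\d{3}E9
# Concatenate both groups into a 10-digit epoch value for _time
FORMAT = $1$2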
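To see what the two REGEX variants produce, here is a Python sketch that applies the same patterns to the sample timestamp and concatenates the groups the way FORMAT = $1$2 would. Python's `re` module stands in for Splunk's regex engine here, which is an assumption, and the thread does not say why the 12-digit variant fails; a plausible reading is that _time expects whole epoch seconds and the 13-digit result is not that.

```python
import re

event = "1.375944219123E9"

# Working pattern: group 1 is the leading "1", group 2 the next 9 digits
# (whole seconds); the 3 millisecond digits are matched but not captured.
working = re.search(r"(1)\.(\d{9})\d{3}E9", event)
epoch_seconds = working.group(1) + working.group(2)     # FORMAT = $1$2
print(epoch_seconds)                                    # 1375944219

# Commented-out pattern: all 12 digits are captured, so the result is a
# 13-digit millisecond count rather than 10-digit epoch seconds.
rejected = re.search(r"(1)\.(\d{12})E9", event)
epoch_millis = rejected.group(1) + rejected.group(2)
print(epoch_millis)                                     # 1375944219123
```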

kristian_kolb · 08-13-2013

Glad to hear that it worked.
