TIME_FORMAT after overriding the source type on a per-event basis

bever
Explorer

Hello,

I have a file exampleFile that has two different timestamp/event formats:

~02 07 10:19:24 OIT-FO-OFR2 NSSTRAP 

and

Feb 05 18:58:43 ANSU-OPS-2 checkpn: OK:ABM3CANDAF34018

As neither timestamp contains the year, Splunk does not index the events with the correct timestamps.

I therefore override the sourcetype on a per-event basis, so that each format gets its own TIME_FORMAT.

In props.conf:

[source::.../exampleFile]
TRANSFORMS-event_1 = event_1
TRANSFORMS-event_2 = event_2

[FORMAT_1]
NO_BINARY_CHECK = 1
TIME_FORMAT =%b %d %H:%M:%S

[FORMAT_2]
NO_BINARY_CHECK = 1
TIME_PREFIX =^\~
TIME_FORMAT =%m %d %H:%M:%S

In transforms.conf:

[event_1]
REGEX = \w{3}\s\d{2}\s\d{2}\:\d{2}\:\d{2}\s.+
FORMAT = sourcetype::FORMAT_1
DEST_KEY = MetaData:Sourcetype

[event_2]
REGEX = \~\d{2}\s\d{2}\s\d{2}\:\d{2}\:\d{2}\s.+
FORMAT = sourcetype::FORMAT_2
DEST_KEY = MetaData:Sourcetype

This works in that the sourcetype is correctly assigned to each event type, but the indexed timestamps are still wrong.

Any ideas on how I can get the TIME_FORMAT of the per-event overridden sourcetype applied?

PS: When I upload a file containing only one event format and assign it directly to sourcetype FORMAT_1 or FORMAT_2, the TIME_FORMAT definition works correctly.
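
The two TIME_FORMAT strings can also be checked outside Splunk. The following is a minimal Python sketch, not Splunk's own parser: it assumes Python's strptime directives behave like the ones used in props.conf for these fields, it is handed the text after the leading tilde (which TIME_PREFIX is meant to skip), and the year is a hypothetical value supplied by the caller.

```python
from datetime import datetime

# TIME_FORMAT values from the [FORMAT_1] and [FORMAT_2] stanzas above.
TIME_FORMATS = {
    "FORMAT_1": "%b %d %H:%M:%S",
    "FORMAT_2": "%m %d %H:%M:%S",
}


def parse(sourcetype, text, assumed_year):
    """Parse a year-less timestamp by prepending an assumed year.

    The year comes from the caller; this is an assumption of the sketch
    and says nothing about how Splunk chooses a year.
    """
    fmt = "%Y " + TIME_FORMATS[sourcetype]
    return datetime.strptime(f"{assumed_year} {text}", fmt)


# 2024 is a placeholder year, not taken from the original events.
print(parse("FORMAT_1", "Feb 05 18:58:43", 2024))  # 2024-02-05 18:58:43
print(parse("FORMAT_2", "02 07 10:19:24", 2024))   # 2024-02-07 10:19:24
```

Both strings parse their sample, which is consistent with the formats being fine on their own and the problem lying elsewhere.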

martin_mueller
SplunkTrust

To avoid the ordering issue altogether (timestamps are extracted before the transforms run), you can do this in props.conf:

[both_formats]
DATETIME_CONFIG=/etc/system/local/mydatetime.xml
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=false

Apply that sourcetype to the entire file, no transforms.conf shenanigans. The content of mydatetime.xml is as follows:

<datetime>
  <define name="_year" extract="year">
    <text><![CDATA[(20\d\d|19\d\d|[901]\d(?!\d))]]></text>
  </define>
  <define name="_month" extract="month">
    <text><![CDATA[(0?[1-9]|1[012])(?!:)]]></text>
  </define>
  <define name="_litmonth"  extract="litmonth">
    <text><![CDATA[(?<![\d\w])(jan|\x{3127}\x{6708}|feb|\x{4E8C}\x{6708}|mar|\x{4E09}\x{6708}|apr|\x{56DB}\x{6708}|may|\x{4E94}\x{6708}|jun|\x{516D}\x{6708}|jul|\x{4E03}\x{6708}|aug|\x{516B}\x{6708}|sep|\x{4E5D}\x{6708}|oct|\x{5341}\x{6708}|nov|\x{5341}\x{3127}\x{6708}|dec|\x{5341}\x{4E8C}\x{6708})[a-z,\.;]*]]></text>
  </define>
  <define name="_day"  extract="day">
    <text><![CDATA[(0?[1-9]|[12]\d|3[01])]]></text> 
  </define>
  <define name="_hour" extract="hour">
    <text><![CDATA[([01]?[1-9]|[012][0-3])(?!\d)]]></text>
  </define>
  <define name="_minute" extract="minute">
    <text><![CDATA[([0-6]\d)(?!\d)]]></text>
  </define>
  <define name="_second" extract="second">
    <text><![CDATA[([0-6]\d)(?!\d)]]></text>
  </define>
  <define name="format_1" extract="litmonth, day, hour, minute, second">
    <text><![CDATA[(\w\w\w) (\d\d?) (\d\d):(\d\d):(\d\d)]]></text>
  </define>
  <define name="format_2" extract="month, day, hour, minute, second">
    <text><![CDATA[(\d\d?) (\d\d?) (\d\d):(\d\d):(\d\d)]]></text>
  </define>
  <timePatterns>
    <use name="format_1"/>
    <use name="format_2"/>
  </timePatterns>
  <datePatterns>
  </datePatterns>
</datetime>

Take a look at $SPLUNK_HOME/etc/datetime.xml for the default version.
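
As a quick sanity check, the two custom patterns can be run against the sample events from the question. This is only a sketch in Python: it assumes a plain left-to-right regex search and tries the patterns in the order listed under timePatterns, which may not match every detail of Splunk's own extractor. The helper name extract is made up for the illustration.

```python
import re

# Patterns copied from the format_1 and format_2 <define> elements above.
FORMAT_1 = re.compile(r"(\w\w\w) (\d\d?) (\d\d):(\d\d):(\d\d)")
FORMAT_2 = re.compile(r"(\d\d?) (\d\d?) (\d\d):(\d\d):(\d\d)")

# Sample events from the question.
SAMPLES = [
    "~02 07 10:19:24 OIT-FO-OFR2 NSSTRAP",
    "Feb 05 18:58:43 ANSU-OPS-2 checkpn: OK:ABM3CANDAF34018",
]


def extract(line):
    """Return (pattern name, captured groups) for the first pattern that matches."""
    for name, pattern in (("format_1", FORMAT_1), ("format_2", FORMAT_2)):
        match = pattern.search(line)
        if match:
            return name, match.groups()
    return None, ()


for line in SAMPLES:
    print(extract(line))
```

In this sketch the first sample is picked up by format_2 (month, day, hour, minute, second as digits) and the second by format_1 (literal month first), so neither pattern claims the other format's events.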

martin_mueller
SplunkTrust

Timestamping happens before the transforms rules are applied, so the overridden sourcetype gets set too late for its TIME_FORMAT to be used.

Some quite informative diagrams are here: http://wiki.splunk.com/Community:HowIndexingWorks
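
The effect of that ordering can be shown with a toy model. This Python sketch is not Splunk code: the two stages and the process function are invented for illustration, and "initial_sourcetype" stands for whatever sourcetype the event carries before the transforms run. Only the two REGEX values are taken from the transforms.conf in the question.

```python
import re

# REGEX values from the [event_1] and [event_2] stanzas in the question.
TRANSFORMS = [
    ("FORMAT_1", re.compile(r"\w{3}\s\d{2}\s\d{2}\:\d{2}\:\d{2}\s.+")),
    ("FORMAT_2", re.compile(r"\~\d{2}\s\d{2}\s\d{2}\:\d{2}\:\d{2}\s.+")),
]


def process(raw, initial_sourcetype):
    """Toy two-stage pipeline: timestamp first, transforms second."""
    # Stage 1 (toy): timestamp settings are looked up with the sourcetype
    # known at this point, before any per-event override exists.
    timestamp_settings_from = initial_sourcetype

    # Stage 2 (toy): transforms run afterwards and rewrite the sourcetype.
    final_sourcetype = initial_sourcetype
    for name, regex in TRANSFORMS:
        if regex.search(raw):
            final_sourcetype = name

    return timestamp_settings_from, final_sourcetype


# "auto_assigned" is a placeholder for the sourcetype given at input time.
print(process("~02 07 10:19:24 OIT-FO-OFR2 NSSTRAP", "auto_assigned"))
print(process("Feb 05 18:58:43 ANSU-OPS-2 checkpn: OK:ABM3CANDAF34018", "auto_assigned"))
```

In both cases the event ends up with the right sourcetype, but the timestamp stage only ever saw the initial one, which mirrors the behaviour described in the question.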
