Hello,
I have a file exampleFile that has two different timestamp/event formats:
~02 07 10:19:24 OIT-FO-OFR2 NSSTRAP
and
Feb 05 18:58:43 ANSU-OPS-2 checkpn: OK:ABM3CANDAF34018
Since neither timestamp contains the year, Splunk does not index the events with the correct time.
I therefore override the sourcetype on a per-event basis.
In props.conf:
[source::.../exampleFile]
TRANSFORMS-event_1 = event_1
TRANSFORMS-event_2 = event_2
[FORMAT_1]
NO_BINARY_CHECK = 1
TIME_FORMAT = %b %d %H:%M:%S
[FORMAT_2]
NO_BINARY_CHECK = 1
TIME_PREFIX = ^\~
TIME_FORMAT = %m %d %H:%M:%S
In transforms.conf:
[event_1]
REGEX = \w{3}\s\d{2}\s\d{2}\:\d{2}\:\d{2}\s.+
FORMAT = sourcetype::FORMAT_1
DEST_KEY = MetaData:Sourcetype
[event_2]
REGEX = \~\d{2}\s\d{2}\s\d{2}\:\d{2}\:\d{2}\s.+
FORMAT = sourcetype::FORMAT_2
DEST_KEY = MetaData:Sourcetype
This works in that the sourcetype is correctly assigned to each event type, but the indexed timestamps are still wrong.
Any ideas on how I can get the TIME_FORMAT of the per-event overridden sourcetype applied?
PS: When I upload a file containing only one event format and assign that file directly to sourcetype FORMAT_1 or FORMAT_2, the TIME_FORMAT definition works correctly.
To avoid the ordering issue altogether (timestamp extraction runs before the transforms), you can do this in props.conf:
[both_formats]
DATETIME_CONFIG=/etc/system/local/mydatetime.xml
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=false
Apply that sourcetype to the entire file; no transforms.conf shenanigans needed. The content of mydatetime.xml is as follows:
<datetime>
<define name="_year" extract="year">
<text><![CDATA[(20\d\d|19\d\d|[901]\d(?!\d))]]></text>
</define>
<define name="_month" extract="month">
<text><![CDATA[(0?[1-9]|1[012])(?!:)]]></text>
</define>
<define name="_litmonth" extract="litmonth">
<text><![CDATA[(?<![\d\w])(jan|\x{3127}\x{6708}|feb|\x{4E8C}\x{6708}|mar|\x{4E09}\x{6708}|apr|\x{56DB}\x{6708}|may|\x{4E94}\x{6708}|jun|\x{516D}\x{6708}|jul|\x{4E03}\x{6708}|aug|\x{516B}\x{6708}|sep|\x{4E5D}\x{6708}|oct|\x{5341}\x{6708}|nov|\x{5341}\x{3127}\x{6708}|dec|\x{5341}\x{4E8C}\x{6708})[a-z,\.;]*]]></text>
</define>
<define name="_day" extract="day">
<text><![CDATA[(0?[1-9]|[12]\d|3[01])]]></text>
</define>
<define name="_hour" extract="hour">
<text><![CDATA[([01]?[1-9]|[012][0-3])(?!\d)]]></text>
</define>
<define name="_minute" extract="minute">
<text><![CDATA[([0-6]\d)(?!\d)]]></text>
</define>
<define name="_second" extract="second">
<text><![CDATA[([0-6]\d)(?!\d)]]></text>
</define>
<define name="format_1" extract="litmonth, day, hour, minute, second">
<text><![CDATA[(\w\w\w) (\d\d?) (\d\d):(\d\d):(\d\d)]]></text>
</define>
<define name="format_2" extract="month, day, hour, minute, second">
<text><![CDATA[(\d\d?) (\d\d?) (\d\d):(\d\d):(\d\d)]]></text>
</define>
<timePatterns>
<use name="format_1"/>
<use name="format_2"/>
</timePatterns>
<datePatterns>
</datePatterns>
</datetime>
Take a look at $SPLUNK_HOME/etc/datetime.xml for the default version.
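As a quick sanity check outside Splunk, the two patterns from the format_1 and format_2 definitions can be run against the sample events from the question. This is only a sketch in Python: the `extract` helper is hypothetical, it uses Python's `re` rather than Splunk's own regex engine, and it simply tries the patterns in the order they are listed under `<timePatterns>`, which may or may not be how Splunk evaluates them.

```python
import re

# Patterns copied from the format_1 and format_2 <define> blocks above.
FORMAT_1 = re.compile(r"(\w\w\w) (\d\d?) (\d\d):(\d\d):(\d\d)")
FORMAT_2 = re.compile(r"(\d\d?) (\d\d?) (\d\d):(\d\d):(\d\d)")

# Sample events taken from the question.
SAMPLES = [
    "~02 07 10:19:24 OIT-FO-OFR2 NSSTRAP",
    "Feb 05 18:58:43 ANSU-OPS-2 checkpn: OK:ABM3CANDAF34018",
]


def extract(line):
    """Return (pattern name, captured groups) for the first pattern that matches."""
    for name, pattern in (("format_1", FORMAT_1), ("format_2", FORMAT_2)):
        match = pattern.search(line)
        if match:
            return name, match.groups()
    return None, ()


for line in SAMPLES:
    print(extract(line))
```

In this sketch the `~`-prefixed event is only picked up by format_2 and the `Feb` event by format_1, so each pattern captures the month, day, hour, minute and second of its own event format.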
Timestamp extraction happens before the transforms rules are applied, so the per-event sourcetype override comes too late for its TIME_FORMAT to be used.
Some quite informative diagrams are here: http://wiki.splunk.com/Community:HowIndexingWorks
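The ordering problem can be pictured with a small toy model in Python. Everything in it is hypothetical (the stage functions, the dict layout, the "original" sourcetype and the "automatic" fallback label); it is not how Splunk is implemented, only an illustration of why a sourcetype set by a transform cannot influence a timestamp that was already extracted.

```python
# Toy model only: stage names, dict layout and the "original" sourcetype are
# hypothetical and merely illustrate the ordering described above.
PROPS = {
    "FORMAT_1": {"TIME_FORMAT": "%b %d %H:%M:%S"},
    "FORMAT_2": {"TIME_FORMAT": "%m %d %H:%M:%S"},
}


def timestamp_stage(event):
    """Pick a TIME_FORMAT using the sourcetype the event carries right now."""
    stanza = PROPS.get(event["sourcetype"], {})
    event["time_format_used"] = stanza.get("TIME_FORMAT", "automatic")
    return event


def transforms_stage(event):
    """Rewrite the sourcetype per event, as the transforms in the question do."""
    event["sourcetype"] = "FORMAT_2" if event["raw"].startswith("~") else "FORMAT_1"
    return event


def index(raw):
    """Run the two stages in the order described: timestamping, then transforms."""
    event = {"raw": raw, "sourcetype": "original"}
    return transforms_stage(timestamp_stage(event))


result = index("~02 07 10:19:24 OIT-FO-OFR2 NSSTRAP")
print(result["sourcetype"], result["time_format_used"])
```

In the toy model the event ends up with sourcetype FORMAT_2, yet the timestamp stage never saw that stanza, which matches the symptom in the question: correct sourcetype, wrong timestamp.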