I'm trying to load one of my logs from my phone server into Splunk. Splunk will read the log file and break the events correctly.
My problem is with the date that Splunk places on the event. Each event in my log has a time but no date; the date appears only in the filename. When I look at the loaded events in Splunk, the time is extracted properly, but the date is the next day (because of the modified date on the file).
I've copied the existing datetime.xml into /etc/system/local and added an extraction for my date from my filename. How can I pull the date from the filename into Splunk as the date on my event?
File name: d:...\VMail-110212-000000.Log
Raw events from d:...\VMail-110212-000000.Log (Splunk shows the date as 2/13/11 because of the file's modified date):
23:59:57.334 ( 4172: 5540) [MS] Entering Process MWI loop
23:59:57.334 ( 4172: 5540) [MS] Process MWI loop return, no more entries left
Datetime.xml:
<datetime>
<define name="_masheddate3" extract="year, month, day">
<text><![CDATA[source::.*?\\Vmail-\d{2}\d{2}\d{2}.*\.Log]]></text>
</define>
...
<datepatterns>
...
<use name="_masheddate3"/>
</datepatterns>
</datetime>
Props.conf:
[ShoreTel_VMail]
SHOULD_LINEMERGE = FALSE
DATETIME_CONFIG = \etc\system\local\datetime.xml
TRANSFORMS-shoretel-comment = shoretel_comment
Inputs.conf:
[monitor://D:\Sample Logs\Shoretel_Current]
disabled=false
host=Shoretel
sourcetype=ShoreTel_VMail
crcSalt=<SOURCE>
index=shoretel
This occurs because the timestamp of the events in your source file is incomplete (it has a time but no date), which leaves Splunk to guess which date to assign to each new event.
In your case, I am fairly certain that the following occurs :
As a result, Splunk will index the file as if it spanned several days. Note that this will also occur if there is a discontinuity in the times of consecutive events. For example, an event recorded with a time of "10:28:59" following an event with a time of "10:29:12" will also trigger an increase of 1 in the value of date_mday (the internal field where Splunk stores the day of the month).
As of 4.1.x, there is no way to prevent this from happening when you are indexing historical data (i.e., files that contain events not from the current day).
If you are indexing current events, you can add the parameter "MAX_DAYS_HENCE = 0" to the props.conf stanza for this source/sourcetype to prevent Splunk from increasing the value of date_mday beyond the current day.
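As a minimal sketch, this is roughly what that setting could look like when added to the [ShoreTel_VMail] stanza from the question. The existing lines are copied unchanged from the props.conf posted above; only the MAX_DAYS_HENCE line is new, and it only helps for files whose events are from the current day.

```ini
# props.conf (sketch: stanza name and existing settings are taken from the question)
[ShoreTel_VMail]
SHOULD_LINEMERGE = FALSE
DATETIME_CONFIG = \etc\system\local\datetime.xml
TRANSFORMS-shoretel-comment = shoretel_comment
# Added line: do not accept event dates later than the current day
MAX_DAYS_HENCE = 0
```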
In 4.2, Splunk will honor the MAX_DIFF_SECS_AGO parameter even for incomplete timestamps (which is not the case in 4.1.x), and you will be able to use that parameter to prevent date_mday increases for both current and historical files. From props.conf.spec:
MAX_DIFF_SECS_AGO = <integer>
* If the event's timestamp is more than <integer> seconds BEFORE the previous timestamp, only accept it if it has the same exact time format as the majority of timestamps from the source.
* IMPORTANT: If your timestamps are wildly out of order, consider increasing this value.
* Defaults to 3600 (one hour).
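On 4.2, the parameter would go in the same props.conf stanza as the other settings for this sourcetype. The sketch below only shows placement: 3600 is just the documented default written out explicitly, not a recommended value, and whatever value suits these logs is an assumption you would need to test against your own files.

```ini
# props.conf (sketch: stanza name taken from the question)
[ShoreTel_VMail]
# Documented default of one hour, shown explicitly; adjust after testing
MAX_DIFF_SECS_AGO = 3600
```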
Thanks for the explanation and help, hexx.