Getting Data In

How to properly parse timestamps in Splunk for some text log files in a directory that have dates without a year?

peter_gianusso
Communicator

I have seen somewhat similar issues on here, but none that meet my situation.

I have a directory on a Windows server with multiple textual log files in it.

Some of the log files in that directory have a date format of MM/DD/YY HH:MM:SS. Splunk parses that just fine.

Other log files in that same directory have a date format of MM/DD HH:MM:SS, with no year. Splunk parses that badly and ends up with the wrong year.

Now obviously the log file needs to be written correctly with the full date/time including the year but that's not an overnight fix.

I saw one solution for a similar issue talk about putting things in the props.conf, but I thought you would need to put that in at the host level and obviously I have multiple situations in the same directory on the same host.

Any help would be appreciated.

cpetterborg
SplunkTrust

You may want to use something like this in your props.conf stanza for the sourcetype:

SEDCMD-addyear = s/([0-9]+\/[0-9]+) /\1\/2014 /

which will add the year to the timestamp (granted, it will be the wrong year starting Jan 1). I suggest you look for a way to write the sed command so that it rewrites only the first occurrence of the timestamp and supplies the correct year. At least this is a starting point. If the files contain data from before the beginning of this year, it will probably not work, since the year for those events would have to be defined explicitly.
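
On the scoping concern raised in the question: props.conf stanzas can be keyed by sourcetype or by source path, not only by host, so a setting like this can apply to just the files without a year. A minimal sketch, assuming a hypothetical sourcetype name app_log_noyear that is assigned only to those files, and with the capture group referenced as \1:

```ini
# props.conf (sketch; the sourcetype name is hypothetical)
[app_log_noyear]
# Insert a hard-coded year after the leading MM/DD. There is no "g" flag,
# so only the first match in each event is rewritten.
SEDCMD-addyear = s/([0-9]+\/[0-9]+) /\1\/2014 /
```

It is worth checking on sample data whether the extracted timestamp actually picks up the added year before relying on this.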

peter_gianusso
Communicator

Thanks.

I'd prefer not to hard-code anything.
The bizarre part is that when I manually load the file into Splunk and get to the "Set Sourcetype" screen, it actually does add the year correctly and I end up with a full date of 11/13/14.

Weird.

cpetterborg
SplunkTrust

Are these old files you are bringing in, or do you just want the new data coming in to be correctly timestamped?

If the former, I don't know how you can do it without changing the file contents before indexing, or using the SEDCMD from above.

If the latter, just use the file modification time:

DATETIME_CONFIG=NONE

From the Splunk documentation:

Set DATETIME_CONFIG = NONE to prevent the timestamp processor from running. When timestamp processing is off, Splunk Enterprise does not look at the text of the event for the timestamp--it instead uses the event's "time of receipt"; in other words, the time the event is received via its input. For file-based inputs, this means that Splunk Enterprise derives the event timestamp from the modification time of the input file.

That may mean your timestamps are slightly off (by seconds), but that is better than the wrong year. You would probably want to start with a zero-length file; otherwise, all the events already in the file will get the same timestamp, namely the file's current modification time.
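
As a rough sketch, the setting could be limited to the year-less files with a source-based stanza, so the other logs in the same directory keep normal timestamp parsing. The directory and file pattern below are hypothetical. For Windows paths, the backslashes in a [source::...] stanza need to be escaped:

```ini
# props.conf (sketch; the path and file pattern are hypothetical)
[source::C:\\logs\\app\\noyear_*.log]
# Skip timestamp extraction from the event text for these files only.
DATETIME_CONFIG = NONE
```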

peter_gianusso
Communicator

I just want newly indexed data to be correctly timestamped.

I figured out that props.conf settings can be scoped to specific sourcetypes. So I added a stanza for that file's sourcetype and set TIME_FORMAT = %m/%d %H:%M:%S

It looks like that doesn't hurt anything.
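
For reference, a minimal sketch of what such a sourcetype stanza might look like. The sourcetype name is hypothetical, and the sketch assumes the timestamp sits at the very start of each event. TIME_FORMAT takes strptime-style directives, which are case-sensitive: %m is month and %d is day, while %H, %M, and %S are hour, minute, and second.

```ini
# props.conf (sketch; the sourcetype name is hypothetical)
[app_log_noyear]
# Assumption: every event begins with the timestamp.
TIME_PREFIX = ^
# "MM/DD HH:MM:SS" is 14 characters long.
MAX_TIMESTAMP_LOOKAHEAD = 14
TIME_FORMAT = %m/%d %H:%M:%S
```

With no year in the format, Splunk still has to infer the year, so events near a year boundary are worth spot-checking.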

Bizarrely, it looks like Splunk was processing the date/time correctly until 10/31. After 10/31 it did not process any events correctly until today.

It started processing the date/time stamp correctly again today, so I can't tell whether what I did works or whether your suggestion would work!

Thanks for your help!!
