TIME_FORMAT around midnight

marcelofinki
Explorer

Hi, I need help specifying a TIME_FORMAT in my props.conf file.

My Log file (OS=Windows) contains date-times like these:

1 - 9/22/2010 23:36:33 PM - CC Housekeeping : Leaving Log All Manual Intervention Pending Payments
1 - 9/22/2010 23:36:33 PM - CC Housekeeping : Leaving ExportBatch
2 - 9/22/2010 23:36:33 PM - CC Housekeeping : Disconnecting from database
1 - 9/22/2010 23:36:33 PM - CC Housekeeping : Mediator has finished main processing
1 - 9/22/2010 23:36:33 PM - CC Housekeeping : Call to Shutdown on objects made
5 - 9/23/2010 0:05:30 AM - CC Housekeeping : Starting Mediator with Debug Mode of  : Proc...
1 - 9/23/2010 0:05:30 AM - CC Housekeeping : Test Mediator
1 - 9/23/2010 0:05:30 AM - CC Housekeeping : Performance Monitor Counters added

Splunk (OS = Windows) interprets the 23:xx (11 PM) rows properly, but does not recognize 0:05:30 AM as 00:05:30 AM.

Sadly, it interprets those times as (05:30:00 AM), five thirty in the morning, instead of zero hours, 5 minutes, 30 seconds.

I believe I need to define a TIME_FORMAT setting in my props.conf file, but I do not know how to specify the hour portion of this format.

Is this correct? %m/%d/%Y %k:%M:%S %p

How can I specify that the hours are not preceded by a leading zero?

Hours in my log file range like this:

0:mm:ss AM  mm minutes after midnight
1:mm:ss AM  one in the morning
9:mm:ss AM  nine in the morning
11:59:59 AM  almost noon
12:00:01 PM  one second after noon
14:06:02 PM  two hours 6 minutes 2 seconds in the afternoon.
23:59:59 PM  almost midnight.

Thanks in advance, Marcelo Finkielsztein

mzax
Splunk Employee

Please edit the hour extraction in $SPLUNK_HOME/etc/datetime.xml. Current:

 <define name="_hour" extract="hour">
 <text><![CDATA[([01]?[1-9]|[012][0-3])(?!\d)]]></text>
 </define>

Change to:

 <define name="_hour" extract="hour">
 <text><![CDATA[([01]?[0-9]|[012][0-3])(?!\d)]]></text>
 </define>
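
To see why the original pattern trips on a single-digit 0 hour, both patterns can be tried in Python's re module. This is only a sketch: Splunk's own regex engine and the way datetime.xml combines _hour with its other patterns are not reproduced here, and the names OLD_HOUR, NEW_HOUR and the two helper functions are made up for illustration.

```python
import re

# Hour patterns copied from the two datetime.xml snippets above.
OLD_HOUR = re.compile(r"([01]?[1-9]|[012][0-3])(?!\d)")
NEW_HOUR = re.compile(r"([01]?[0-9]|[012][0-3])(?!\d)")

def hour_at_start(pattern, text):
    """Return the hour matched at the very start of text, or None."""
    m = pattern.match(text)
    return m.group(1) if m else None

def first_hour_anywhere(pattern, text):
    """Return the first hour-like match found anywhere in text, or None."""
    m = pattern.search(text)
    return m.group(1) if m else None

sample = "0:05:30 AM"
print(hour_at_start(OLD_HOUR, sample))        # None
print(hour_at_start(NEW_HOUR, sample))        # 0
print(first_hour_anywhere(OLD_HOUR, sample))  # 05
print(hour_at_start(OLD_HOUR, "00:05:30"))    # 00
```

In this sketch the old pattern rejects a lone "0" but accepts "00", and an unanchored search lands on "05" instead. That appears consistent with the 05:30 misreading described in the question.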

laserval
Communicator

This is the case for Splunk 6.1.x as well.

marcelofinki
Explorer

Good News Everyone,
Changing datetime.xml fixed the issue.
Just FYI, other things (like TIME_PREFIX, BREAK_ONLY_BEFORE, etc.) have not worked for me.
Thank You.
Marcelo

mitch_1
Splunk Employee

As Stephen said, you want to have a TIME_PREFIX so that TIME_FORMAT knows where to start looking for a timestamp.

If TIME_FORMAT can't parse the timestamp at the beginning of the selected text (i.e. the beginning of the line after stripping TIME_PREFIX off) it will fail, and fall back to the built-in heuristics. Based on your failure case, it seems you're almost certainly in that state -- the heuristics are finding the "05:30 AM" and assuming that's the time. The unusual combination of the 24-hour time followed by AM/PM is just confusing it.

The reason it falls back to the heuristics is to try to find timestamps even if some lines don't follow the TIME_FORMAT rule (some broken file formats have inconsistent time formats). It arguably should be more verbose when it does this, since sometimes you end up in this sort of situation where its reasoning seems pretty opaque.

So just add the TIME_PREFIX (mzax's suggested value looks good to me) and the TIME_FORMAT should take over. I also suggest not including the "%p", since the AM/PM is superfluous given the 24-hour format.

marcelofinki
Explorer

Thank you Stephen,

I am going to try it over the weekend or on Monday, even though, according to the Unix documentation:

%H     hour (00..23)
%k     hour ( 0..23)

Today I tried %k, but it did not work for me. Splunk kept mistaking the 0 hours. When the time is 1:00 AM or later, Splunk recognizes it OK, but from 0:00:01 to 0:59:59 I am in trouble.

Thanks for your suggestion.
Marcelo

marcelofinki
Explorer

I have just crafted a log file that contains "00" for the hour, and Splunk indexed it properly.
However, I also have a sample file with hours expressed as "0:mm:ss" that Splunk cannot recognize properly.
Once it is 1 AM, timestamps get parsed correctly, but from midnight to 0:59:59 they don't.

I can send samples of these files and my diag output if needed.

I could also set up a webex session for you to visit my server.

Please let me know.

THANK YOU,
Marcelo

Stephen_Sorkin
Splunk Employee

I have just tested it with our internal tool to test TIME_FORMATs and %H doesn't require a leading 0 for single digit hours.

gkanapathy
Splunk Employee

I have seen a lot of people post in answers and in the old forums about troubles with Splunk reading a single-digit '0' as a valid hour, but I have not encountered it myself.

Stephen_Sorkin
Splunk Employee

You should be fine with a configuration in props.conf like:

[<sourcetype>]
TIME_PREFIX = (?=\d+/\d+/\d{4} \d)
TIME_FORMAT = %m/%d/%Y %H:%M:%S

You need the prefix to bring you to the beginning of the time string, assuming it's not at the beginning of the line. The %p isn't necessary if your data is on a 24 hour clock.
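
As a rough illustration of how a zero-width prefix plus a format string cooperate, here is a Python sketch. Python's strptime is only a stand-in for Splunk's timestamp parser and may differ from it in details. TIMESTAMP_TEXT and parse_event_time are hypothetical helpers, not Splunk settings.

```python
import re
from datetime import datetime

# Values copied from the props.conf suggestion above.
TIME_PREFIX = re.compile(r"(?=\d+/\d+/\d{4} \d)")
TIME_FORMAT = "%m/%d/%Y %H:%M:%S"

# Hypothetical helper pattern (not a Splunk setting): isolates the timestamp
# text so that strptime is not handed the rest of the line.
TIMESTAMP_TEXT = re.compile(r"\d+/\d+/\d{4} \d+:\d+:\d+")

def parse_event_time(line):
    """Jump to where TIME_PREFIX matches, then parse with TIME_FORMAT."""
    prefix = TIME_PREFIX.search(line)
    if prefix is None:
        return None
    stamp = TIMESTAMP_TEXT.match(line, prefix.end())
    if stamp is None:
        return None
    return datetime.strptime(stamp.group(0), TIME_FORMAT)

print(parse_event_time("1 - 9/23/2010 0:05:30 AM - CC Housekeeping : Test Mediator"))
# 2010-09-23 00:05:30
print(parse_event_time("1 - 9/22/2010 23:36:33 PM - CC Housekeeping : Leaving ExportBatch"))
# 2010-09-22 23:36:33
```

Because the prefix is a lookahead, it consumes nothing: the parse starts exactly at the month digit, and the trailing AM/PM is simply never read.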

mzax
Splunk Employee

Marcelo, your file doesn't have newlines, so you need to use BREAK_ONLY_BEFORE to break it in the correct place.
Then the TIME_FORMAT will work.
The correct configuration in props.conf will be:
BREAK_ONLY_BEFORE = \d+\s-\s
TIME_PREFIX = \d+\s-\s
TIME_FORMAT = %m/%d/%Y %H:%M:%S

marcelofinki
Explorer

(continuation)
Splunk correctly recognized events up to 23:59:59 and events after 1:00:00 AM.
For events with a timestamp of 0:mm:ss (midnight to 1 AM), Splunk included all of them (about 250 lines of the file) in ONE /multiline/ event (!?).

When I run this search:
index="sampleindex" starttime=9/22/2010:23:36:32 endtime=9/23/2010:01:05:31 | sort _time

I can clearly see one line per event, except for one "chunk" of 257 lines that starts at 23:36:33 (the last entry before midnight) and has all the 0:mm:ss lines stuck onto that last pre-midnight line.

Any ideas? THANKS in advance

marcelofinki
Explorer

Hi again!

Not sure why you used that TIME_PREFIX; I assume it is just a sample.
In my case, in front of the date-time I have a number from 1 to 10 that identifies the "severity" of the event, then a dash, and then the date-time.
I have inserted this in the props.conf file:

[sample_time_format]
TIME_PREFIX=\d+\s-\s
TIME_FORMAT=%m/%d/%Y %H:%M:%S

The prefix regex means: one or more digits, then a space, then a dash, then a space.
Hope this is correct, please confirm.
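
For what it's worth, the prefix regex can be sanity-checked outside Splunk with Python's re module. This is only a sketch of the regex itself, not of how Splunk applies TIME_PREFIX, and text_after_prefix is a hypothetical helper name. Note that \s matches any whitespace character, not only a space.

```python
import re

# TIME_PREFIX value from the stanza above.
TIME_PREFIX = re.compile(r"\d+\s-\s")

def text_after_prefix(line):
    """Return what follows the first TIME_PREFIX match, or None if no match."""
    m = TIME_PREFIX.search(line)
    return line[m.end():] if m else None

line = "5 - 9/23/2010 0:05:30 AM - CC Housekeeping : Test Mediator"
print(text_after_prefix(line))
# 9/23/2010 0:05:30 AM - CC Housekeeping : Test Mediator
```

On these sample lines the first match is the severity number and dash at the start, so the remaining text begins at the month digit.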

On the TIME_FORMAT line I have tested both %H and %k, both without success.
(to be continued)
