How to set date & time stamps across two lines in xml where time was already picked up

david_rea
Explorer

Hi Team

I'm trying to ingest an XML file in the following raw format (an extracted portion as a sample; each event contains many more values):

<response><row><row _id="1767282" _uuid="0D981036-9B9C-4841-969E-1DC5755039CC" _position="1767282" _address="http://data.montgomerycountymd.gov/resource/_4mse-ku6q/1767282"><date_of_stop>2015-08-08T00:00:00</date_of_stop><time_of_stop>23:58:00</time_of_stop><agency>MCP</agency>

I have broken the events on each row using LINE_BREAKER and pointed TIME_PREFIX at date_of_stop. The events are broken nicely, but the timestamp picks up the time immediately following the date (00:00:00) rather than the desired time_of_stop value.

The following is the specified props.conf:

[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=\<date_of_stop\>
MAX_TIMESTAMP_LOOKAHEAD=85
TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S

When I run the preview of the data input however, I get the following results:
(screenshot of the data preview, not preserved)

I also tried to utilise the datetime.xml approach with a dummy mydatetime.xml & added the following lines based on reading other answers:
<?xml version="1.0" encoding="UTF-8"?>

 <mydatetime>
 <define name="Date" extract="year, month, day">
     <text>date_of_stop>(\d{4})-(\d{2})-(\d{2})</text>
 </define>

 <define name="Time" extract="hour, minute, second">
     <text>time_of_stop>(\d{2}):(\d{2}):(\d{2})</text>
 </define>

 <timePatterns>
     <use name="Date"/>
     <use name="Time"/>
 </timePatterns>

 <datePatterns>
     <use name="Date"/>
     <use name="Time"/>
 </datePatterns>
 </mydatetime>
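
As a quick sanity check on the two patterns above, here is a minimal sketch in Python that runs the same regular expressions against the sample fragment. It only tests the regexes themselves; it says nothing about how Splunk loads or applies a custom datetime.xml. The placement of the newline between the two tags is an assumption based on the error message quoted later in the thread.

```python
import re

# Sample fragment from the question; the newline between tags is an assumption.
event = (
    "<date_of_stop>2015-08-08T00:00:00</date_of_stop>\n"
    "<time_of_stop>23:58:00</time_of_stop>"
)

# The same expressions used in the <text> elements of mydatetime.xml.
date_pattern = re.compile(r"date_of_stop>(\d{4})-(\d{2})-(\d{2})")
time_pattern = re.compile(r"time_of_stop>(\d{2}):(\d{2}):(\d{2})")

date_match = date_pattern.search(event)
time_match = time_pattern.search(event)

year, month, day = date_match.groups()
hour, minute, second = time_match.groups()
print(year, month, day, hour, minute, second)
```

Both patterns match the sample, so the expressions themselves do not appear to be the obstacle here.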

And updated the props.conf accordingly:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
# TIME_PREFIX=\<date_of_stop\>
# MAX_TIMESTAMP_LOOKAHEAD=85
# TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S

But I then receive the following when previewing the data set (after restarting Splunk services):
(screenshot of the data preview, not preserved)

I've reviewed the Splunk Answers posts on related questions and applied a number of other variations, but have not had a successful result where Splunk bypasses the time it detects automatically straight after the date_of_stop value. I'm also not convinced that datetime.xml is the right approach, but it is where the documentation has led me so far.

somesoni2
SplunkTrust

This worked fine for me

KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=56
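
The lookahead value can be checked by counting characters. The sketch below assumes that MAX_TIMESTAMP_LOOKAHEAD is counted in characters from the end of the TIME_PREFIX match (date_of_stop>) and uses the sample values from the question. The variant with a newline between the tags is an assumption based on the error message quoted later in the thread.

```python
# Text that follows the TIME_PREFIX match (date_of_stop>) in the sample event.
date_part = "2015-08-08T00:00:00"
close_tag = "</date_of_stop>"
open_tag = "<time_of_stop>"
time_part = "23:58:00"

# Tags directly adjacent, as in the raw sample.
same_line = date_part + close_tag + open_tag + time_part

# Tags on consecutive lines (assumed layout of the real file).
with_newline = date_part + close_tag + "\n" + open_tag + time_part

print(len(same_line))     # 56
print(len(with_newline))  # 57
```

Under that assumption, 56 covers the single-line layout and one extra character is needed when a newline sits between the tags.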

david_rea
Explorer

Thanks rschoensee! Removing the regex tokens from TIME_FORMAT resulted in the timestamp being picked up across multiple lines.

Props.conf
 TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S
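
The point about regex tokens can be illustrated with a rough analogy in Python. TIME_FORMAT is a strptime-style string, so text outside the % directives is matched literally and is not interpreted as a regular expression. The sketch below uses Python's datetime.strptime, which is not Splunk's parser and does not support %n, so a literal newline stands in for it. The sample string and the newline position are assumptions taken from the thread.

```python
from datetime import datetime

# Text following TIME_PREFIX in the sample event (newline position assumed).
sample = "2015-08-08T00:00:00</date_of_stop>\n<time_of_stop>23:58:00"

# Literal tag text plus strptime directives only.
literal_format = "%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S"

# Same layout, but with regex tokens (\d) left in the format string.
regex_format = r"%Y-%m-%dT\d\d:\d\d:\d\d</date_of_stop>" + "\n<time_of_stop>%H:%M:%S"


def try_parse(text, fmt):
    """Return the parsed datetime, or None if strptime rejects the input."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


parsed = try_parse(sample, literal_format)
failed = try_parse(sample, regex_format)
print(parsed)
print(failed)
```

In this analogy the literal format yields the date from date_of_stop combined with the time from time_of_stop, while the format containing \d is rejected because the backslash sequences are treated as plain characters.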

david_rea
Explorer

Thanks somesoni2. I tried this, but it did not associate the required time with the event. I updated the props.conf to reflect your suggestions as follows:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
# DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
# TIME_FORMAT=%Y-%m-%dT\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S
# TIME_FORMAT=%Y-%m-%dT00:00:00\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=57

But the resulting preview when uploading the data after a restart of Splunk shows the same result as before I introduced the datetime.xml. The message shown in the event's alert is:

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"

david_rea
Explorer

I also tried with the newline option but got the same result:

TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S

rschoensee
Explorer

Just use the strptime format %n instead of the regex \n in the TIME_FORMAT string.

TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S

david_rea
Explorer

I also tried with tags included as per your suggestion
Props.conf

TIME_FORMAT="%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S"

TIME_FORMAT="%Y-%m-%dT00:00:00"</date_of_stop>%n<time_of_stop>"%H:%M:%S"

TIME_FORMAT="%Y-%m-%d"T00:00:00</date_of_stop>%n<time_of_stop>"%H:%M:%S"

TIME_FORMAT=%Y-%m-%dT00:00:00"</date_of_stop>"%n"<time_of_stop>"%H:%M:%S

TIME_FORMAT=%Y-%m-%d"T00:00:00</date_of_stop>"%n"<time_of_stop>"%H:%M:%S

Still the same strptime message.

rschoensee
Explorer

Try it without using any double quote
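
One way to see why quoting matters: in a strptime-style format, a double quote is just another literal character that must appear in the input. The sketch below shows this with Python's datetime.strptime as a stand-in. It is not Splunk's parser, and a literal newline replaces %n, which Python does not support. The sample string is an assumption based on the thread.

```python
from datetime import datetime

# Text following TIME_PREFIX in the sample event (newline position assumed).
sample = "2015-08-08T00:00:00</date_of_stop>\n<time_of_stop>23:58:00"

unquoted = "%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S"
# The same format wrapped in double quotes, as in some of the attempts above.
quoted = '"' + unquoted + '"'


def parses(text, fmt):
    """Return True if strptime accepts the text with the given format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


print(parses(sample, unquoted))
print(parses(sample, quoted))
```

The quoted variant fails in this sketch because the input contains no quote characters for the format to match.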

david_rea
Explorer

Bingo! Thanks rschoensee!

I used the following syntax:
props.conf

TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S

This now captures the timestamp from the later time_of_stop value.

david_rea
Explorer

Thanks rschoensee, unfortunately getting the same result. I've tried the following entries in props.conf:
TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n<time_of_stop>%H:%M:%S

In both cases, I still receive the stated message:

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"

rschoensee
Explorer

Hi David,
in your
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n\ %H:%M:%S
the "<time_of_stop>" part is missing. Also, please stop using backslashes here;
just use exactly

 TIME_FORMAT = %Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S

david_rea
Explorer

Thanks again rschoensee

I've tried the following combinations in the props.conf:
Props.conf entries
TIME_FORMAT="%Y-%m-%dT00:00:00%n%H:%M:%S"
TIME_FORMAT="%Y-%m-%d"T00:00:00"%n%H:%M:%S"
TIME_FORMAT="%Y-%m-%d""T00:00:00""%n%H:%M:%S"
TIME_FORMAT=%Y-%m-%d"T00:00:00"%n%H:%M:%S
TIME_FORMAT="%Y-%m-%d"T00:00:00%n"%H:%M:%S"

All modifications have resulted in the same message when looking at the data input preview

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"

Can you please tell me exactly where the double quotes should be positioned? It seems like a silly question, but my searches through other Splunk answers have not helped me find this. I'm eager to resolve this and appreciate your help, thanks.

david_rea
Explorer

BTW, I used the code sample option to show double quotes

david_rea
Explorer

FYI, other TIME_FORMAT syntax I attempted in props.conf:

TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S