How to set date & time stamps across two lines in xml where time was already picked up

david_rea
Explorer

Hi Team

I'm trying to ingest an XML file in the following raw format (a portion extracted as a sample; each event contains many more values):

<response><row><row _id="1767282" _uuid="0D981036-9B9C-4841-969E-1DC5755039CC" _position="1767282" _address="http://data.montgomerycountymd.gov/resource/_4mse-ku6q/1767282"><date_of_stop>2015-08-08T00:00:00</date_of_stop><time_of_stop>23:58:00</time_of_stop><agency>MCP</agency>

I have broken the events on the row elements using LINE_BREAKER and set TIME_PREFIX to date_of_stop. The events are broken nicely, but the timestamp picks up the time value immediately following the date value (00:00:00), not the desired time_of_stop.

The following is the specified props.conf:

[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=\<date_of_stop\>
MAX_TIMESTAMP_LOOKAHEAD=85
TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S

When I run the preview of the data input however, I get the following results:

I also tried to utilise the datetime.xml approach with a dummy mydatetime.xml & added the following lines based on reading other answers:
<?xml version="1.0" encoding="UTF-8"?>

 <mydatetime>
 <define name="Date" extract="year, month, day">
     <text>date_of_stop>(\d{4})-(\d{2})-(\d{2})</text>
 </define>

 <define name="Time" extract="hour, minute, second">
     <text>time_of_stop>(\d{2}):(\d{2}):(\d{2})</text>
 </define>

 <timePatterns>
     <use name="Date"/>
     <use name="Time"/>
 </timePatterns>

 <datePatterns>
     <use name="Date"/>
     <use name="Time"/>
 </datePatterns>
 </mydatetime>

And updated the props.conf accordingly:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
# TIME_PREFIX=\<date_of_stop\>
# MAX_TIMESTAMP_LOOKAHEAD=85
# TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S

But I still do not get the desired timestamp when viewing a preview of the data set (after a restart of Splunk services).

I've reviewed all the related Splunk Answers questions and applied a number of other variations, but have not had a successful result where Splunk bypasses the time automatically detected straight after the date_of_stop value. I'm also not convinced that datetime.xml is the right approach, but it is where the documentation has led me so far.


somesoni2
Revered Legend

This worked fine for me

KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=56
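
As a rough check on the MAX_TIMESTAMP_LOOKAHEAD value above, the following Python sketch counts the characters between the end of TIME_PREFIX and the end of the time_of_stop value in the sample event from the question. The helper name `lookahead_needed` is hypothetical and used only for illustration; it is not part of Splunk.

```python
# Sample fragment taken from the event in the question.
event = ('<date_of_stop>2015-08-08T00:00:00</date_of_stop>'
         '<time_of_stop>23:58:00</time_of_stop>')


def lookahead_needed(raw: str, prefix: str, end_marker: str) -> int:
    """Count the characters between the end of `prefix` and the start of `end_marker`."""
    start = raw.index(prefix) + len(prefix)
    end = raw.index(end_marker, start)
    return end - start


# Both elements on one line, as in the sample above.
same_line = lookahead_needed(event, "date_of_stop>", "</time_of_stop>")

# Same fragment with a single newline between the two elements.
with_newline = lookahead_needed(
    event.replace("><time", ">\n<time"), "date_of_stop>", "</time_of_stop>"
)

print(same_line, with_newline)
```

Under these assumptions the count is 56 when the two elements sit on one line and 57 when a single newline separates them.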


david_rea
Explorer

Thanks rschoensee! Removing the regex tokens from TIME_FORMAT resulted in the timestamp being parsed across multiple lines.

Props.conf
 TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S
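
The difference between literal text and regex tokens in a strptime-style format can be sketched with Python's `datetime.strptime`. This is only an analogy and an assumption on my part: Splunk uses its own timestamp parser, and Python does not support the `%n` directive, so a literal newline stands in for it here.

```python
from datetime import datetime

# Text following TIME_PREFIX in the sample event, with the elements on two lines.
raw = "2015-08-08T00:00:00</date_of_stop>\n<time_of_stop>23:58:00"

# Literal text between the date and time directives, no regex tokens.
literal_fmt = "%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S"

# Regex tokens such as \d+ are treated as literal characters, not patterns.
regex_fmt = r"%Y-%m-%dT\d+:\d+:\d+</date_of_stop>\n<time_of_stop>%H:%M:%S"

parsed = datetime.strptime(raw, literal_fmt)
print(parsed)

try:
    datetime.strptime(raw, regex_fmt)
    regex_ok = True
except ValueError:
    # The literal "\d+" never appears in the raw text, so parsing fails.
    regex_ok = False

print(regex_ok)
```

In this sketch the literal format yields a timestamp with the date from date_of_stop and the time from time_of_stop, while the regex-style format raises a ValueError.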

david_rea
Explorer

Thanks Somesoni2, but I tried this and it did not associate the required time with the event. I updated the props.conf to reflect your suggestions as follows:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
# DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
# TIME_FORMAT=%Y-%m-%dT\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S
# TIME_FORMAT=%Y-%m-%dT00:00:00\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=57

But the resulting preview when uploading the data after a restart of Splunk shows the same result as before I introduced the datetime.xml. The message shown on the event alert is:

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"


david_rea
Explorer

I also tried with the newline option but got the same result.
syntax

TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S

rschoensee
Explorer

Just use the strptime format %n instead of the regex \n in the TIME_FORMAT string.

TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S

david_rea
Explorer

I also tried with the tags included, as per your suggestion:
Props.conf

TIME_FORMAT="%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S"

TIME_FORMAT="%Y-%m-%dT00:00:00"</date_of_stop>%n<time_of_stop>"%H:%M:%S"

TIME_FORMAT="%Y-%m-%d"T00:00:00</date_of_stop>%n<time_of_stop>"%H:%M:%S"

TIME_FORMAT=%Y-%m-%dT00:00:00"</date_of_stop>"%n"<time_of_stop>"%H:%M:%S

TIME_FORMAT=%Y-%m-%d"T00:00:00</date_of_stop>"%n"<time_of_stop>"%H:%M:%S

Still the same strptime message.


rschoensee
Explorer

Try it without using any double quotes.

david_rea
Explorer

Bingo! Thanks rschoensee!

I used the following syntax:
props.conf

TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S

This captures the timestamp from the later time_of_stop value.


david_rea
Explorer

Thanks rschoensee, but unfortunately I'm getting the same result. I've tried the following entries in props.conf:
TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n<time_of_stop>%H:%M:%S

In both cases, I still receive the stated message:

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"


rschoensee
Explorer

Hi David,
in your
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n\ %H:%M:%S
the "<time_of_stop>" part is missing, and please stop using backslashes here.
Just use exactly:

 TIME_FORMAT = %Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S

david_rea
Explorer

Thanks again rschoensee

I've tried the following combinations in the props.conf:
Props.conf entries
TIME_FORMAT="%Y-%m-%dT00:00:00%n%H:%M:%S”
TIME_FORMAT="%Y-%m-%d"T00:00:00"%n%H:%M:%S”
TIME_FORMAT="%Y-%m-%d""T00:00:00""%n%H:%M:%S”
TIME_FORMAT=%Y-%m-%d"T00:00:00"%n%H:%M:%S
TIME_FORMAT="%Y-%m-%d"T00:00:00%n"%H:%M:%S"

All modifications have resulted in the same message when looking at the data input preview:

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"

Can you please tell me exactly where the double quotes should be positioned? It seems like a silly question, but my searches through other Splunk answers have not helped me find this. I'm eager to resolve this and appreciate your help, thanks.


david_rea
Explorer

BTW, I used the code sample option to show double quotes


david_rea
Explorer

FYI, here is other props.conf syntax I attempted with TIME_FORMAT:
Syntax

TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S