
How to set date & time stamps across two lines in xml where time was already picked up

Explorer

Hi Team

I'm trying to ingest an XML file in the following raw format (an extracted portion as a sample; each event contains many more values):

<response><row><row _id="1767282" _uuid="0D981036-9B9C-4841-969E-1DC5755039CC" _position="1767282" _address="http://data.montgomerycountymd.gov/resource/_4mse-ku6q/1767282"><date_of_stop>2015-08-08T00:00:00</date_of_stop><time_of_stop>23:58:00</time_of_stop><agency>MCP</agency>

I have set line breaking with LINE_BREAKER on row\s and pointed TIME_PREFIX at date_of_stop. The events break nicely, but the timestamp picks up the time immediately following the date value (00:00:00), which is not the desired time_of_stop.

The following is the specified props.conf:

[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=\<date_of_stop\>
MAX_TIMESTAMP_LOOKAHEAD=85
TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S

When I run the preview of the data input however, I get the following results:
[Screenshot of the data preview]

I also tried the datetime.xml approach with a dummy mydatetime.xml, adding the following lines based on other answers:
<?xml version="1.0" encoding="UTF-8"?>

 <mydatetime>
 <define name="Date" extract="year, month, day">
     <text>date_of_stop>(\d{4})-(\d{2})-(\d{2})</text>
 </define>

 <define name="Time" extract="hour, minute, second">
     <text>time_of_stop>(\d{2}):(\d{2}):(\d{2})</text>
 </define>

 <timePatterns>
     <use name="Date"/>
     <use name="Time"/>
 </timePatterns>

 <datePatterns>
     <use name="Date"/>
     <use name="Time"/>
 </datePatterns>
 </mydatetime>

And updated the props.conf accordingly:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
# TIME_PREFIX=\<date_of_stop\>
# MAX_TIMESTAMP_LOOKAHEAD=85
# TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S

But I then receive the following when previewing the data set (after restarting the Splunk services):
[Screenshot of the data preview]

I've reviewed the Splunk Answers posts on related questions and tried a number of other variations, but have not found one where Splunk bypasses the time automatically detected straight after the date_of_stop value. I'm also not convinced that datetime.xml is the right approach, but it is where the documentation has led me so far.

1 Solution

SplunkTrust

This worked fine for me

KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=56
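
The lookahead value above can be checked by counting characters. This is a minimal sketch, assuming MAX_TIMESTAMP_LOOKAHEAD counts characters starting at the end of the TIME_PREFIX match; the sample values come from the event in the question, and the helper name lookahead_needed is hypothetical.

```python
# Text that follows the TIME_PREFIX match (date_of_stop>) in the sample event.
DATE_PART = "2015-08-08T00:00:00</date_of_stop>"
TIME_PART = "<time_of_stop>23:58:00"


def lookahead_needed(separator=""):
    """Count the characters the timestamp parser has to see after the prefix."""
    return len(DATE_PART + separator + TIME_PART)


# Both tags on one line, as in the raw sample.
same_line = lookahead_needed()
# A newline between the two tags adds one character.
with_newline = lookahead_needed("\n")

print(same_line, with_newline)
```

If the two tags are separated by a line break in the indexed event, the lookahead has to grow by the length of that separator.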


Explorer

Thanks rschoensee! Removing the regex escapes from TIME_FORMAT allowed the timestamp to be read across multiple lines.

Props.conf
 TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S
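
As an illustration of why the regex tokens got in the way: a strptime-style format treats everything outside the % directives as literal text. The sketch below uses Python's datetime.strptime purely as an analogy; Splunk's timestamp parser is a separate implementation, and the helper try_parse is hypothetical.

```python
from datetime import datetime

# Sample text following the TIME_PREFIX match, taken from the question.
SAMPLE = "2015-08-08T00:00:00</date_of_stop><time_of_stop>23:58:00"

# Literal text between the directives, no regex tokens.
LITERAL_FORMAT = "%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S"

# Regex-style tokens: strptime looks for the literal characters "\d", not digits.
REGEX_STYLE_FORMAT = r"%Y-%m-%dT\d\d:\d\d:\d\d</date_of_stop><time_of_stop>%H:%M:%S"


def try_parse(text, fmt):
    """Return the parsed datetime, or None if the format does not match."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


literal_result = try_parse(SAMPLE, LITERAL_FORMAT)
regex_result = try_parse(SAMPLE, REGEX_STYLE_FORMAT)

print(literal_result)
print(regex_result)
```

Only the literal format parses here; the regex-style one fails because the sample contains digits where the format demands a backslash followed by "d".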


Explorer

Thanks Somesoni2, but I tried this and it did not associate the required time with the event. I updated props.conf to reflect your suggestions:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]

KV_MODE=xml
# DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
# TIME_FORMAT=%Y-%m-%dT\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S
# TIME_FORMAT=%Y-%m-%dT00:00:00\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=57

But after restarting Splunk, the preview when uploading the data shows the same result as before I introduced datetime.xml. The message shown on the event's alert is:

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"


Explorer

I also tried with the newline option but same result
syntax

TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S

Explorer

Just use the strptime format %n instead of the regex \n in the TIME_FORMAT string.

TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S

Explorer

I also tried with tags included as per your suggestion
Props.conf

TIME_FORMAT="%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S"

TIME_FORMAT="%Y-%m-%dT00:00:00"</date_of_stop>%n<time_of_stop>"%H:%M:%S"

TIME_FORMAT="%Y-%m-%d"T00:00:00</date_of_stop>%n<time_of_stop>"%H:%M:%S"

TIME_FORMAT=%Y-%m-%dT00:00:00"</date_of_stop>"%n"<time_of_stop>"%H:%M:%S

TIME_FORMAT=%Y-%m-%d"T00:00:00</date_of_stop>"%n"<time_of_stop>"%H:%M:%S

Still the same strptime message.


Explorer

Try it without using any double quote

Explorer

Bingo! Thanks rschoensee!

I used the following syntax:
props.conf

TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S

This captures the timestamp from the later time_of_stop value.
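
For anyone wanting to sanity-check the behaviour outside Splunk, here is a rough sketch using Python's datetime.strptime as a stand-in. It assumes the date and time tags are separated by a single newline, swaps %n for a literal newline (so nothing depends on how Python itself handles %n), and shows that wrapping the format in double quotes makes those quotes part of the expected text. The helpers to_python_format and try_parse are hypothetical, and Splunk's own parser may differ in detail.

```python
from datetime import datetime

# Two-line event fragment, starting right after the TIME_PREFIX match.
EVENT_TEXT = "2015-08-08T00:00:00</date_of_stop>\n<time_of_stop>21:56:00"

# The TIME_FORMAT value that worked in the thread.
SPLUNK_FORMAT = "%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S"


def to_python_format(splunk_format):
    """Swap %n for a literal newline so Python's parser sees plain whitespace."""
    return splunk_format.replace("%n", "\n")


def try_parse(text, fmt):
    """Return the parsed datetime, or None if the format does not match."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


# Unquoted format: the literal tags and the newline line up with the event.
unquoted = try_parse(EVENT_TEXT, to_python_format(SPLUNK_FORMAT))
# Quoted format: the quotes are expected in the data, so nothing matches.
quoted = try_parse(EVENT_TEXT, to_python_format('"' + SPLUNK_FORMAT + '"'))

print(unquoted)
print(quoted)
```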


Explorer

Thanks rschoensee, unfortunately getting the same result. I've tried the following entries in props.conf:
TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n<time_of_stop>%H:%M:%S

In both cases, I still receive the stated message:

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"


Explorer

Hi David,
in your
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n\ %H:%M:%S
the "<time_of_stop>" part is missing, and please stop using backslashes here.
Just use exactly:

 TIME_FORMAT = %Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S

Explorer

Thanks again rschoensee

I've tried the following combinations in the props.conf:
Props.conf entries
TIME_FORMAT="%Y-%m-%dT00:00:00%n%H:%M:%S"
TIME_FORMAT="%Y-%m-%d"T00:00:00"%n%H:%M:%S"
TIME_FORMAT="%Y-%m-%d""T00:00:00""%n%H:%M:%S"
TIME_FORMAT=%Y-%m-%d"T00:00:00"%n%H:%M:%S
TIME_FORMAT="%Y-%m-%d"T00:00:00%n"%H:%M:%S"

All modifications have resulted in the same message when looking at the data input preview

Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"

Can you please tell me exactly where the double quotes should be positioned? It seems like a silly question, but my searches through other Splunk answers have not helped me find this. I'm eager to resolve this and appreciate your help, thanks.


Explorer

BTW, I used the code sample option to show double quotes


Explorer

FYI on other props.conf syntax attempted with TIME_FORMAT:
Syntax

TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S