Hi Team
I'm trying to ingest an XML file in the following raw format (an extracted portion as a sample; each event contains many more values):
<response><row><row _id="1767282" _uuid="0D981036-9B9C-4841-969E-1DC5755039CC" _position="1767282" _address="http://data.montgomerycountymd.gov/resource/_4mse-ku6q/1767282"><date_of_stop>2015-08-08T00:00:00</date_of_stop><time_of_stop>23:58:00</time_of_stop><agency>MCP</agency>
I break the events with LINE_BREAKER on the row elements and set TIME_PREFIX to date_of_stop. The events break cleanly, but the timestamp picks up the time immediately following the date (00:00:00) rather than the desired time_of_stop value.
The following is the specified props.conf:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]
KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=\<date_of_stop\>
MAX_TIMESTAMP_LOOKAHEAD=85
TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S
When I run the preview of the data input however, I get the following results:
I also tried to utilise the datetime.xml approach with a dummy mydatetime.xml & added the following lines based on reading other answers:
<?xml version="1.0" encoding="UTF-8"?>
<mydatetime>
<define name="Date" extract="year, month, day">
<text>date_of_stop>(\d{4})-(\d{2})-(\d{2})</text>
</define>
<define name="Time" extract="hour, minute, second">
<text>time_of_stop>(\d{2}):(\d{2}):(\d{2})</text>
</define>
<timePatterns>
<use name="Date"/>
<use name="Time"/>
</timePatterns>
<datePatterns>
<use name="Date"/>
<use name="Time"/>
</datePatterns>
</mydatetime>
And updated the props.conf accordingly:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]
KV_MODE=xml
DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
# TIME_PREFIX=\<date_of_stop\>
# MAX_TIMESTAMP_LOOKAHEAD=85
# TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S
But I then receive the following when previewing the data set (after restarting the Splunk services):
I've reviewed all the related Splunk Answers questions and tried a number of other variations, but have not had a successful result where Splunk bypasses the time it automatically detects straight after the date_of_stop value. I'm also not convinced that datetime.xml is the right approach, but it is where the documentation has led me so far.
This worked fine for me
KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=56
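As a rough check of where the value 56 comes from, the sketch below counts the characters that follow TIME_PREFIX up to the end of the time_of_stop value. It is only an illustration in Python, not something Splunk runs; the sample string is assembled from the event shown in the question, and the variable names are made up for this example.

```python
# Characters that must be scanned after TIME_PREFIX (date_of_stop>) to reach
# the end of the time_of_stop value, using the sample event from the question.
date_part = "2015-08-08T00:00:00"
between_tags = "</date_of_stop><time_of_stop>"
time_part = "23:58:00"

lookahead = len(date_part + between_tags + time_part)
print(lookahead)

# If a newline separates the two tags in the raw data, one more character
# has to be covered.
lookahead_with_newline = len(
    date_part + "</date_of_stop>\n<time_of_stop>" + time_part
)
print(lookahead_with_newline)
```

This prints 56 and 57, which matches the two MAX_TIMESTAMP_LOOKAHEAD values used in this thread for the single-line and newline-separated cases.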
Thanks rschoensee! Removing the regex escapes from TIME_FORMAT allowed the timestamp to be read across multiple lines.
Props.conf
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S
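To illustrate why the regex-style TIME_FORMAT attempts fail, here is a minimal sketch using Python's datetime.strptime as a stand-in. This is an analogy only: Splunk has its own strptime implementation, and Python has no %n directive, so a plain newline in the format is used instead (Python matches whitespace in the format against whitespace in the data). The point it shows is that strptime-style formats treat everything outside the % directives as literal text, not as a regular expression.

```python
from datetime import datetime

# Sample text following TIME_PREFIX, with a newline between the two tags.
sample = "2015-08-08T00:00:00</date_of_stop>\n<time_of_stop>23:58:00"

# Literal text between the date and time fields; the newline in the format
# matches the newline in the data.
literal_format = "%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S"
parsed = datetime.strptime(sample, literal_format)
print(parsed)

# Regex tokens such as \d are taken as literal characters, so this format
# does not match the same data.
regex_style_format = (
    r"%Y-%m-%dT\d\d:\d\d:\d\d</date_of_stop>" + "\n" + "<time_of_stop>%H:%M:%S"
)
try:
    datetime.strptime(sample, regex_style_format)
    regex_style_ok = True
except ValueError:
    regex_style_ok = False
print(regex_style_ok)
```

The first call parses to 23:58:00 on 2015-08-08, while the regex-style format raises ValueError, so regex_style_ok ends up False.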
Thanks Somesoni2, but I tried this and it did not associate the required time with the event. I updated the props.conf to reflect your suggestions as follows:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]
KV_MODE=xml
# DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=date_of_stop\>
# TIME_FORMAT=%Y-%m-%dT\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S
# TIME_FORMAT=%Y-%m-%dT00:00:00\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop><time_of_stop>%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=57
But the resulting preview when uploading the data after a restart of Splunk shows the same result as before I introduced the datetime.xml. The message shown on the event alert is:
Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"
I also tried with the newline option, but got the same result:
syntax
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>\n<time_of_stop>%H:%M:%S
Just use the strptime Format %n instead of the regex \n in the TIME_FORMAT String.
TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S
I also tried with tags included as per your suggestion
Props.conf
TIME_FORMAT="%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S"
TIME_FORMAT="%Y-%m-%dT00:00:00"</date_of_stop>%n<time_of_stop>"%H:%M:%S"
TIME_FORMAT="%Y-%m-%d"T00:00:00</date_of_stop>%n<time_of_stop>"%H:%M:%S"
TIME_FORMAT=%Y-%m-%dT00:00:00"</date_of_stop>"%n"<time_of_stop>"%H:%M:%S
TIME_FORMAT=%Y-%m-%d"T00:00:00</date_of_stop>"%n"<time_of_stop>"%H:%M:%S
Still the same strptime message.
Try it without using any double quotes.
Bingo! Thanks rschoensee!
I used the following syntax:
props.conf
TIME_FORMAT=%Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S
This captures the timestamp from the later time_of_stop value, as intended.
Thanks rschoensee, unfortunately getting the same result. I've tried the following entries in props.conf:
TIME_FORMAT=%Y-%m-%dT00:00:00%n%H:%M:%S
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n<time_of_stop>%H:%M:%S
In both cases, I still receive the stated message:
Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"
Hi David,
in your
TIME_FORMAT=%Y-%m-%dT00:00:00<\/date_of_stop>%n\ %H:%M:%S
the "<time_of_stop>" part is missing. Also, please stop using backslashes here.
Use exactly this:
TIME_FORMAT = %Y-%m-%dT00:00:00</date_of_stop>%n<time_of_stop>%H:%M:%S
Thanks again rschoensee
I've tried the following combinations in the props.conf:
Props.conf entries
TIME_FORMAT="%Y-%m-%dT00:00:00%n%H:%M:%S"
TIME_FORMAT="%Y-%m-%d"T00:00:00"%n%H:%M:%S"
TIME_FORMAT="%Y-%m-%d""T00:00:00""%n%H:%M:%S"
TIME_FORMAT=%Y-%m-%d"T00:00:00"%n%H:%M:%S
TIME_FORMAT="%Y-%m-%d"T00:00:00%n"%H:%M:%S"
All of these modifications resulted in the same message in the data input preview:
Could not use strptime to parse timestamp from "2015-08-08T00:00:00\n21:56:00"
Can you please tell me exactly where the double quotes should be positioned? It seems like a silly question, but my searches through other Splunk answers have not helped me find this. I'm eager to resolve this and appreciate your help, thanks.
BTW, I used the code sample option to show double quotes
FYI on other props.conf syntax attempted with TIME_FORMAT:
Syntax
TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\<time_of_stop\>%H:%M:%S
TIME_FORMAT=%Y-%m-%d T\d+:\d+:\d+\<\/date_of_stop\>\n\<time_of_stop\>%H:%M:%S