Hi Team
I'm trying to ingest an XML file in the following raw format (an extracted portion as a sample; each event contains many more values):
<response><row><row _id="1767282" _uuid="0D981036-9B9C-4841-969E-1DC5755039CC" _position="1767282" _address="http://data.montgomerycountymd.gov/resource/_4mse-ku6q/1767282"><date_of_stop>2015-08-08T00:00:00</date_of_stop><time_of_stop>23:58:00</time_of_stop><agency>MCP</agency>
I have broken the events with LINE_BREAKER on the row element and pointed TIME_PREFIX at date_of_stop. The events are broken nicely, but the timestamp picks up the time value immediately following the date (the 00:00:00 inside date_of_stop) rather than the desired time_of_stop value.
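To make the goal concrete, here is a small Python sketch of the timestamp I am hoping Splunk will assign. This is only an illustration of the desired result, not Splunk configuration; the function name `intended_timestamp` is made up for this example, and the fragment is trimmed from the sample row above.

```python
import re
from datetime import datetime

# Fragment trimmed from the sample row above.
raw = ('<date_of_stop>2015-08-08T00:00:00</date_of_stop>'
       '<time_of_stop>23:58:00</time_of_stop><agency>MCP</agency>')

def intended_timestamp(event: str) -> datetime:
    """Combine the date from date_of_stop with the time from time_of_stop."""
    # Take only the calendar date, ignoring the T00:00:00 that follows it.
    date_part = re.search(r'<date_of_stop>(\d{4}-\d{2}-\d{2})T', event).group(1)
    # Take the real time of the stop from the next element.
    time_part = re.search(r'<time_of_stop>(\d{2}:\d{2}:\d{2})<', event).group(1)
    return datetime.strptime(f'{date_part} {time_part}', '%Y-%m-%d %H:%M:%S')

print(intended_timestamp(raw))  # 2015-08-08 23:58:00
```

In other words, the event above should be stamped 23:58:00 on 2015-08-08, not midnight.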
The following is the specified props.conf:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]
KV_MODE=xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
TIME_PREFIX=\<date_of_stop\>
MAX_TIMESTAMP_LOOKAHEAD=85
TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S
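As a sanity check on the lookahead value, the Python sketch below measures how many characters lie between the end of the TIME_PREFIX match and the end of the time_of_stop value in the sample row. It assumes the lookahead is counted from the end of the TIME_PREFIX match, and it only measures the trimmed sample fragment, so other rows may differ.

```python
import re

# Fragment trimmed from the sample row above.
raw = ('<date_of_stop>2015-08-08T00:00:00</date_of_stop>'
       '<time_of_stop>23:58:00</time_of_stop><agency>MCP</agency>')

# Position just after the TIME_PREFIX match.
prefix_end = re.search(r'<date_of_stop>', raw).end()

# Position just after the time_of_stop value.
time_end = re.search(r'<time_of_stop>\d{2}:\d{2}:\d{2}', raw).end()

# Number of characters the timestamp processor would need to look ahead.
span = time_end - prefix_end
print(span)
```

For this sample the span is well under 85, so the lookahead setting alone does not appear to be what is cutting off the time_of_stop value.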
When I run the preview of the data input, however, I get the following results:
I also tried to utilise the datetime.xml approach with a dummy mydatetime.xml & added the following lines based on reading other answers:
<?xml version="1.0" encoding="UTF-8"?>
<mydatetime>
<define name="Date" extract="year, month, day">
<text>date_of_stop>(\d{4})-(\d{2})-(\d{2})</text>
</define>
<define name="Time" extract="hour, minute, second">
<text>time_of_stop>(\d{2}):(\d{2}):(\d{2})</text>
</define>
<timePatterns>
<use name="Date"/>
<use name="Time"/>
</timePatterns>
<datePatterns>
<use name="Date"/>
<use name="Time"/>
</datePatterns>
</mydatetime>
And updated the props.conf accordingly:
[source::/Users/daithi/Dataset_upload/montgomery-traffic-0809-sample.xml]
KV_MODE=xml
DATETIME_CONFIG=/Applications/Splunk/etc/mydatetime.xml
LINE_BREAKER=([\r\n]*)(?=\<row\s|\<\w+)
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=([\r\n]*)(?=\<row\s)
# TIME_PREFIX=\<date_of_stop\>
# MAX_TIMESTAMP_LOOKAHEAD=85
# TIME_FORMAT=%Y-%m-%d\w\d\d:\d\d:\d\d\<\/date_of_stop\>%n\<time_of_stop\>%H:%M:%S
But I then receive the following when viewing a preview of the data set (after restarting the Splunk services):
I've reviewed all the Splunk Answers posts on related questions and applied a number of other variations, but have not had a successful result where Splunk bypasses the time it automatically detects straight after the date_of_stop value. I'm also not convinced that datetime.xml is the right approach, but it is where the documentation has led me so far.