<?xml version="1.0" encoding="UTF-8" ?><dataroot><Interceptor><AttackCoords>-80.33xxxxxxx22947</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>23</Infiltrators><Enforcer>Ironwood</Enforcer><ActionDate>2013-04-24</ActionDate><ActionTime>00:0xx:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-80.2xxxxxxxxxxxx475695</LaunchCoords><AttackVessel>Rustic</AttackVessel></Interceptor>
Hi everyone,
I have an unstructured XML file in which each event is supposed to start at "<Interceptor>".
While uploading the file I set BREAK_ONLY_BEFORE (see the stanza below), but the setting does not seem to take effect.
[xyz ]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
TIME_FORMAT=%Y-%m-%d
TIME_PREFIX=<ActionDate>
MAX_TIMESTAMP_LOOKAHEAD=100
BREAK_ONLY_BEFORE=<Interceptor>
I was having the same problem, and the LINE_BREAKER solution posted above appears to solve my problem, but I wanted to understand why BREAK_ONLY_BEFORE didn't work. I made it work by inserting a newline into the XML. So riqbal's data would look like this:
<?xml version="1.0" encoding="UTF-8" ?><dataroot>
<Interceptor><AttackCoords>-80.33xxxxxxx22947</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>23</Infiltrators><Enforcer>Ironwood</Enforcer><ActionDate>2013-04-24</ActionDate><ActionTime>00:0xx:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-80.2xxxxxxxxxxxx475695</LaunchCoords><AttackVessel>Rustic</AttackVessel></Interceptor>
Then I got exactly the result I wanted (events being identified by a specific XML tag).
This result makes sense, because the functional description for BREAK_ONLY_BEFORE is "When set, Splunk software creates a new event only if it encounters a new line that matches the regular expression." (Emphasis added on "a new line".) In riqbal's data, Splunk won't find "a new line" with <Interceptor> in it unless the data is multi-line and the string appears again on another line. Additionally, anything on the same line before the tag (e.g., closing tags for the previous event, which is my situation) will also become part of the next event.
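To make the line-based behavior concrete, here is a rough Python model of it. This is only a sketch of the idea described above, not Splunk's actual implementation; the function name `merge_lines` and the shortened sample record are made up for illustration.

```python
import re

def merge_lines(raw, break_only_before):
    """Rough model: split into lines, start a new event at each matching line."""
    pattern = re.compile(break_only_before)
    events = []
    for line in raw.splitlines():
        # A matching line (or the very first line) opens a new event;
        # any other line is merged into the current event.
        if pattern.search(line) or not events:
            events.append([line])
        else:
            events[-1].append(line)
    return ["\n".join(lines) for lines in events]

# Shortened, hypothetical stand-ins for the data in this thread.
header = '<?xml version="1.0" encoding="UTF-8" ?><dataroot>'
record = "<Interceptor><Outcome>Interdiction</Outcome></Interceptor>"

one_line = header + record + record                      # no newlines at all
with_newlines = header + "\n" + record + "\n" + record   # newline before each tag

print(len(merge_lines(one_line, r"<Interceptor>")))       # 1
print(len(merge_lines(with_newlines, r"<Interceptor>")))  # 3
```

With no newlines in the input there is only one line to test, so the whole file stays one event no matter what the regex is.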
LINE_BREAKER seems to be the better solution, however, since no editing of the XML is needed. FYI, this line (edited to match the preceding example) in props.conf worked for me:
LINE_BREAKER=([\r\n]*)\<Interceptor
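For anyone wondering why this works on data without newlines, here is a small Python approximation of how a LINE_BREAKER-style regex splits the raw stream: the text matched by the first capturing group is thrown away, and the event boundary sits at that position. It is a simplified model under my own assumptions, not Splunk code; `break_events` and the shortened sample record are hypothetical.

```python
import re

def break_events(raw, line_breaker):
    """Rough model: break the stream wherever the regex matches, dropping group 1."""
    pattern = re.compile(line_breaker)
    events = []
    start = 0
    for m in pattern.finditer(raw):
        # Everything before group 1 closes the previous event.
        chunk = raw[start:m.start(1)]
        if chunk:
            events.append(chunk)
        # Group 1 itself (the newlines, possibly empty) is discarded.
        start = m.end(1)
    if raw[start:]:
        events.append(raw[start:])
    return events

# Shortened, hypothetical stand-ins for the data in this thread.
header = '<?xml version="1.0" encoding="UTF-8" ?><dataroot>'
record = "<Interceptor><Outcome>Interdiction</Outcome></Interceptor>"
raw = header + record + record + "</dataroot>"   # a single line, no newlines

events = break_events(raw, r"([\r\n]*)<Interceptor")
print(len(events))   # 3
print(events[1])     # the first complete <Interceptor> record
```

Because `[\r\n]*` can match zero characters, the pattern finds a boundary right before every `<Interceptor` even when the file has no line breaks. When relying on LINE_BREAKER alone, SHOULD_LINEMERGE is usually set to false so the events are not merged back together afterwards.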
@riqbal, I converted your comment to an answer. Please accept it to mark the question as answered, and upvote any other answer that helped.
Hi,
The value is a regular expression, so try this:
BREAK_ONLY_BEFORE = \<Interceptor\>
No, that is not working. The XML file has no line breaks.