I have an XML file with multiple tags I want to break on. Only a subset of the tags should cause a break, not all of them.
E.g.
< Security> ... < /Security> should be an event
< Admin> ... < /Admin> should be another event
< Order> ... < /Order> should be another event
I tried to break on a regex, but it did not work:
"< Security>" | "< Admin>" | "< Order>" | "< Payment>"
As a bonus, it would be nice if the XML header were disregarded (not listed as an event).
I found it:
Enter this string in the regex field of Data preview (please remove the blanks after the < sign; I added them only because this forum would not accept the text otherwise):
(?m)^(< Admin>)|(< Order>)|(< Security>)|(< Payment>)
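What it means:
(?m) ... multiline mode: ^ matches at the start of every line, not only at the start of the input
^ ... the search term is at the beginning of the line (as written, this applies only to the first alternative, because | binds more loosely than ^)
() ... a grouped search term
< Admin> ... (e.g.) search for the exact phrase, case-sensitive
| ... logical OR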
or directly in the props.conf file:
[NameOfTheSourcetype]
BREAK_ONLY_BEFORE = (?m)^(< Admin>)|(< Order>)|(< Security>)|(< Payment>)
NO_BINARY_CHECK = 1
TIME_PREFIX = < Date_and_time>
pulldown_type = 1
I added TIME_PREFIX because my timestamps are tagged this way. Leave it out or adapt it, since your files will probably be tagged differently.
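To double-check how the pattern is read, here is a small Python sketch. It uses Python's re module rather than Splunk's own regex engine, so treat it as an approximation. It compares the pattern as posted (forum-only blanks removed) with a hypothetical variant that puts all alternatives into one group; the sample string is made up for illustration.

```python
import re

# Pattern as posted, with the forum-only blanks after "<" removed.
posted = re.compile(r"(?m)^(<Admin>)|(<Order>)|(<Security>)|(<Payment>)")

# Hypothetical variant (not from the answer): one group around all
# alternatives, so that "^" applies to each of them.
grouped = re.compile(r"(?m)^(<Admin>|<Order>|<Security>|<Payment>)")

# Made-up sample: the second line with <Order> is indented on purpose.
sample = "<Content>\n<Admin>\n  <Order>\n<Order>\n"

# Offsets at which each pattern matches.
posted_hits = [m.start() for m in posted.finditer(sample)]
grouped_hits = [m.start() for m in grouped.finditer(sample)]

print("posted :", posted_hits)
print("grouped:", grouped_hits)
```

The indented < Order> is matched only by the posted pattern, because there the ^ belongs to the < Admin> alternative alone. For files where every tag starts at the beginning of a line, both variants find the same places.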
Hi,
In this answer you mention "^...the search term is at the beginning of the line".
Is it really necessary for the tag to be at the start of a line?
In my case the XML has no spaces or newlines between the tags:
`< ?xml version="1.0" encoding="UTF-8"?>< Content>< Admin>< Disregard_1>[]< /Disregard_1>< Date_and_time>Mon Jan 13 22:44:53 MET 2014< /Date_and_time>< Domain>01< /Domain>< Disregard_4>18512< /Disregard_4>< Machine_name>Server1< /Machine_name>< Usecase>12< /Usecase>
< /Admin>< Order>< Disregard_1>[---]< /Disregard_1>< Date_and_time>Wed Jan 15 11:19:25 MET 2014< /Date_and_time>< Domain>02< /Domain>< Machine_name>Server2< /Machine_name>< Usecase>06< /Usecase>< Actor>< Type_of_actor>USER< /Type_of_actor>< /Actor>< /Order>< Order>< Disregard_1>[---]< /Disregard_1< Date_and_time>Thu Jan 16 12:18:03 MET 2014< /Date_and_time>< Domain>02< /Domain>< Machine_name>Server2< /Machine_name>< Usecase>06< /Usecase>< /Order>< Alerting>< Disregard_1>ab< /Disregard_1>< Date_and_time>Tue Jan 14 09:56:37 MET 2014< /Date_and_time>< Machine_name>Server3< /Machine_name>< Usecase>01< /Usecase>< /Alerting>< /Content>
So will it work?
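As a rough check of what ^ does on input without newlines, here is a Python sketch. It only shows plain regex behaviour on a string; it does not show what Splunk itself will do, which also depends on its line-breaking settings. The input is a shortened, hypothetical version of the data above, and the variable names are made up.

```python
import re

# Shortened, hypothetical single-line input modelled on the data above
# (forum-only blanks after "<" removed).
one_line = (
    '<?xml version="1.0" encoding="UTF-8"?><Content>'
    "<Admin><Usecase>12</Usecase></Admin>"
    "<Order><Usecase>06</Usecase></Order>"
    "<Alerting><Usecase>01</Usecase></Alerting></Content>"
)

tags = r"<Admin>|<Order>|<Security>|<Payment>"

# With "^" in front of every alternative, a match is only possible at
# offset 0, because the input contains no newline.
anchored = re.findall(r"(?m)^(?:" + tags + ")", one_line)

# Without the anchor, each listed opening tag is found wherever it occurs.
unanchored = re.findall(tags, one_line)

# Split in front of each listed tag (zero-width lookahead, Python 3.7+).
# The first piece is the header, the rest are the candidate events.
pieces = re.split(r"(?=" + tags + ")", one_line)
header, events = pieces[0], pieces[1:]

print(anchored)
print(unanchored)
print(len(events))
```

So for a file with no line breaks, an anchored pattern has nothing to latch onto, while an unanchored one still finds the tags. As in the original question, tags that are not in the list (here < Alerting>) stay attached to the preceding event.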
@nasrinmulani This thread is nearly 4 years old with an accepted answer so you're unlikely to get many responses. I suggest you post a new question describing your problem. Reference this answer if you wish.
Can you post your props and transforms configuration? Where are you placing your regex?
Here is a typical sample of this file (with adapted XML tags):
< ?xml version="1.0" encoding="UTF-8"?>
< Content>
< Admin>
< Disregard_1>[]< /Disregard_1>
< Date_and_time>Mon Jan 13 22:44:53 MET 2014< /Date_and_time>
< Domain>01< /Domain>
< Disregard_4>18512< /Disregard_4>
< Machine_name>Server1< /Machine_name>
< Usecase>12< /Usecase>
< /Admin>
< Order>
< Disregard_1>[---]< /Disregard_1>
< Date_and_time>Wed Jan 15 11:19:25 MET 2014< /Date_and_time>
< Domain>02< /Domain>
< Machine_name>Server2< /Machine_name>
< Usecase>06< /Usecase>
< Actor>
< Type_of_actor>USER< /Type_of_actor>
< /Actor>
< /Order>
< Order>
< Disregard_1>[---]< /Disregard_1>
< Date_and_time>Thu Jan 16 12:18:03 MET 2014< /Date_and_time>
< Domain>02< /Domain>
< Machine_name>Server2< /Machine_name>
< Usecase>06< /Usecase>
< /Order>
< Alerting>
< Disregard_1>ab< /Disregard_1>
< Date_and_time>Tue Jan 14 09:56:37 MET 2014< /Date_and_time>
< Machine_name>Server3< /Machine_name>
< Usecase>01< /Usecase>
< /Alerting>
< /Content>
I am putting the regex in Add data » Files & directories » Data preview, in the field "Specify a pattern or regex to break before" (ex: \d+foo\d[2,4], Start Of Event, ^***).
The props and transforms are untouched so far.
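The "break before" idea can be tried outside Splunk with a short Python sketch on a shortened version of the sample above. This is only an illustration of the logic, not Splunk's implementation: split_events is a hypothetical helper, the pattern is a grouped variant with the forum-only blanks removed, and dropping the lines before the first match (the header) is a choice made in this sketch.

```python
import re

# Grouped so that "^" applies to every alternative.
BREAK_BEFORE = re.compile(r"^(<Admin>|<Order>|<Security>|<Payment>)")


def split_events(text):
    """Start a new event at each line that matches BREAK_BEFORE.

    Lines before the first match (the XML header) are discarded.
    """
    events = []
    current = None  # stays None until the first matching line is seen
    for line in text.splitlines():
        if BREAK_BEFORE.search(line.strip()):
            if current is not None:
                events.append("\n".join(current))
            current = []
        if current is not None:
            current.append(line)
    if current is not None:
        events.append("\n".join(current))
    return events


# Shortened sample with the same tag layout as the file above.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<Content>
<Admin>
<Usecase>12</Usecase>
</Admin>
<Order>
<Usecase>06</Usecase>
</Order>
<Order>
<Usecase>06</Usecase>
</Order>
<Alerting>
<Usecase>01</Usecase>
</Alerting>
</Content>"""

events = split_events(sample)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

Because < Alerting> is not in the list, its lines and the closing < /Content> end up in the last < Order> event. That is the expected consequence of breaking only before a subset of tags.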