I am trying to ingest XML files and split the elements into fields. My log files are:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><SmartPanel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" DocumentCreationDate="2019-07-09T10:18:04" DocumentVersion="5" PanID="15" LogCreationDate="2019-07-08T18:45:32" TvID="0" xmlns="urn:nds:dyn:pms:Smart:v1" xsi:schemaLocation="urn:nds:dyn:pms:Smart:v1 /apps/WEB-INF/amsXmlSchema.xsd"><Subscriber SubscriberID="126" DeviceID="2915"><SmartNoSubstitution EventTime="2019-07-08T18:45:53"><availId>175696022</availId><reason>0</reason><ServiceKey>4049</ServiceKey></SmartNoSubstitution><SmartNoSubstitution EventTime="2019-07-08T18:57:05"><availId>175696024</availId><reason>0</reason><ServKey>4049</ServKey></SmartNoSubstitution></Subscriber></SmartPanel>
<?xml version="1.0" encoding="UTF-8" standalone="no"?><SmartPanel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" DocumentCreationDate="2019-07-09T11:18:04" DocumentVersion="5" PanID="5" LogCreationDate="2019-07-08T19:45:32" TvID="0" xmlns="urn:nds:dyn:pms:Smart:v1" xsi:schemaLocation="urn:nds:dyn:pms:Smart:v1 /apps/WEB-INF/amsXmlSchema.xsd"><Subscriber SubscriberID="178" DeviceID="45615"></Subscriber></SmartPanel>
Based on other questions, my props.conf and transforms.conf are below:
props.conf:
[pms]
TIME_PREFIX=EventTime
TIME_FORMAT=%Y-%m-%dT%H:%M:%S
SHOULD_LINEMERGE=false
TRUNCATE=100000
LINE_BREAKER=\>\s*(?=\)
REPORT-xmlext=xml-extr

transforms.conf:
[xml-extr]
REGEX=<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT=$1::$2
MV_ADD=true
REPEAT_MATCH=true
However, the only file being ingested is the second one, and it is only giving fields where there is an =.
I have tried KV_MODE=xml, but this has not helped.
I have used regex101 to validate the regex:
Match 1
Full match 451-479 <availId>175696022</availId>
Group 1. 452-459 availId
Group 2. 460-469 175696022
Match 2
Full match 479-497 <reason>0</reason>
Group 1. 480-486 reason
Group 2. 487-488 0
Match 3
Full match 497-526 <ServiceKey>4049</ServiceKey>
Group 1. 498-508 ServiceKey
Group 2. 509-513 4049
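The same check can be reproduced outside regex101. The following is a minimal sketch in Python, assuming a trimmed copy of the first sample event (shortened here for readability) and Python's `re` module rather than Splunk's own regex engine. It suggests that the pattern only captures leaf elements such as availId, and never the attributes (PanID, SubscriberID, and so on), so any fields containing an = would have to come from some other extraction.

```python
import re

# Same pattern as REGEX in the [xml-extr] stanza.
PATTERN = re.compile(r'<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>')

# Assumption: a shortened version of the first sample event, not the full log line.
sample = (
    '<SmartPanel PanID="15" TvID="0">'
    '<Subscriber SubscriberID="126" DeviceID="2915">'
    '<SmartNoSubstitution EventTime="2019-07-08T18:45:53">'
    '<availId>175696022</availId><reason>0</reason><ServiceKey>4049</ServiceKey>'
    '</SmartNoSubstitution>'
    '</Subscriber></SmartPanel>'
)

# findall returns one (name, value) tuple per match, similar to FORMAT=$1::$2.
pairs = PATTERN.findall(sample)
for name, value in pairs:
    print(f"{name}::{value}")
```

With this trimmed sample the loop prints availId::175696022, reason::0 and ServiceKey::4049. Container elements such as Subscriber and SmartNoSubstitution are not matched, because the pattern requires the closing tag to follow the text content directly.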
Does anybody have any advice?
Your question is very unclear. The settings that you have will work correctly for the first case and
KV_MODE=auto will work for the 2nd case. So what EXACTLY is your problem here? As far as
LINE_BREAKER goes, we cannot help you unless you show us multiple events exactly the way that they are in the file (with all variations).
Use TIME_PREFIX=EventTime=" (it probably also works with just TIME_PREFIX=EventTime, but it is better to be as specific as possible, I would say).
Set KV_MODE = none in props.conf.
Remove REPEAT_MATCH=true, since that setting only applies to index-time extractions.
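To illustrate the TIME_PREFIX point, here is a rough sketch in Python of the idea behind TIME_PREFIX and TIME_FORMAT: find the prefix, then parse what follows with the format string. It only mimics the concept and is not Splunk's actual timestamp processor; the function name `extract_timestamp` and the shortened event are made up for this example.

```python
import re
from datetime import datetime

# Values taken from the props.conf discussion; the prefix includes =" so that
# parsing starts right at the first digit of the timestamp.
TIME_PREFIX = r'EventTime="'
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"

# Assumption: a shortened event, cut down from the first sample log.
event = '<SmartNoSubstitution EventTime="2019-07-08T18:45:53"><availId>175696022</availId>'

def extract_timestamp(raw):
    """Hypothetical helper: return the datetime after TIME_PREFIX, or None."""
    m = re.search(TIME_PREFIX, raw)
    if m is None:
        return None
    # The format %Y-%m-%dT%H:%M:%S always renders as 19 characters.
    candidate = raw[m.end():m.end() + 19]
    return datetime.strptime(candidate, TIME_FORMAT)

ts = extract_timestamp(event)
print(ts)
```

With the more specific prefix, the text handed to the parser begins at the year. With just EventTime as the prefix, it would begin at =" instead, which is why being specific seems the safer choice.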
Thanks for your comments. I have tried what you suggested; however, the line breaking and field extraction still do not work.
I still get fields that are based on elements with an =, but anything after SmartNoSubstitution is not being extracted.
What linebreaker are you now using? Because what you have doesn't make much sense to me as I said and I didn't suggest anything else yet.
Then I guess the first thing to do is some troubleshooting to confirm whether Splunk is really using the configuration at all.
Check (e.g. using btool) that the indexers / heavy forwarders have the configuration for the index time things (line breaking, timestamping). Have you restarted them after making the changes? Make sure when testing that you are actually looking at freshly ingested events, otherwise you're not going to see the effect of any changes to index time config.
Check the Search Heads have the field extraction config (e.g. confirm it is present from the GUI Settings -> Fields and has appropriate permission and sharing settings to make the config available in the app where you run the search).