Hi,
I am trying to ingest XML files and split the elements into fields. My log files look like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><SmartPanel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" DocumentCreationDate="2019-07-09T10:18:04" DocumentVersion="5" PanID="15" LogCreationDate="2019-07-08T18:45:32" TvID="0" xmlns="urn:nds:dyn:pms:Smart:v1" xsi:schemaLocation="urn:nds:dyn:pms:Smart:v1 /apps/WEB-INF/amsXmlSchema.xsd"><Subscriber SubscriberID="126" DeviceID="2915"><SmartNoSubstitution EventTime="2019-07-08T18:45:53"><availId>175696022</availId><reason>0</reason><ServiceKey>4049</ServiceKey></SmartNoSubstitution><SmartNoSubstitution EventTime="2019-07-08T18:57:05"><availId>175696024</availId><reason>0</reason><ServKey>4049</ServKey></SmartNoSubstitution></Subscriber></SmartPanel>
and
<?xml version="1.0" encoding="UTF-8" standalone="no"?><SmartPanel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" DocumentCreationDate="2019-07-09T11:18:04" DocumentVersion="5" PanID="5" LogCreationDate="2019-07-08T19:45:32" TvID="0" xmlns="urn:nds:dyn:pms:Smart:v1" xsi:schemaLocation="urn:nds:dyn:pms:Smart:v1 /apps/WEB-INF/amsXmlSchema.xsd"><Subscriber SubscriberID="178" DeviceID="45615"></Subscriber></SmartPanel>
Based on other questions, my props.conf and transforms.conf are below.
props.conf
[pms]
TIME_PREFIX=EventTime
TIME_FORMAT=%Y-%m-%dT%H:%M:%S
SHOULD_LINEMERGE=false
TRUNCATE=100000
LINE_BREAKER=\>\s*(?=\)
REPORT-xmlext=xml-extr
and
transforms.conf
[xml-extr]
REGEX=<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT=$1::$2
MV_ADD=true
REPEAT_MATCH=true
However, only the second file is being ingested, and it only produces fields where there is an =.
I have tried KV_MODE=xml, but this has not helped.
I have used regex101 to validate the regex:
Match 1
Full match 451-479 <availId>175696022</availId>
Group 1. 452-459 availId
Group 2. 460-469 175696022
Match 2
Full match 479-497 <reason>0</reason>
Group 1. 480-486 reason
Group 2. 487-488 0
Match 3
Full match 497-526 <ServiceKey>4049</ServiceKey>
Group 1. 498-508 ServiceKey
Group 2. 509-513 4049
Does anybody have any advice?
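The regex101 result can also be reproduced offline. The following is a minimal sketch in Python, assuming that Python's `re` module treats this pattern the same way as the PCRE engine Splunk uses (the pattern uses no PCRE-only features). The helper name `extract_fields` and the shortened `SAMPLE` string are illustrative, not part of any Splunk API; collecting repeated names into lists is meant to mimic what MV_ADD=true does.

```python
import re

# Shortened copy of the first sample event from the question.
SAMPLE = (
    '<Subscriber SubscriberID="126" DeviceID="2915">'
    '<SmartNoSubstitution EventTime="2019-07-08T18:45:53">'
    '<availId>175696022</availId><reason>0</reason>'
    '<ServiceKey>4049</ServiceKey></SmartNoSubstitution>'
    '<SmartNoSubstitution EventTime="2019-07-08T18:57:05">'
    '<availId>175696024</availId><reason>0</reason>'
    '<ServKey>4049</ServKey></SmartNoSubstitution>'
    '</Subscriber>'
)

# Same pattern as REGEX in the [xml-extr] stanza.
PATTERN = re.compile(r'<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>')


def extract_fields(text):
    """Return {element name: [values]} for every <name>value</name> pair."""
    fields = {}
    for name, value in PATTERN.findall(text):
        # Repeated names accumulate, similar in spirit to MV_ADD=true.
        fields.setdefault(name, []).append(value)
    return fields


fields = extract_fields(SAMPLE)
for name, values in fields.items():
    print(name, values)
```

Note that the pattern only captures elements that directly wrap text. Attributes such as EventTime or SubscriberID are not matched by it, so any fields for those have to come from somewhere else (for example automatic key=value extraction).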
Your question is very unclear. The settings that you have will work correctly for the first case and KV_MODE=auto will work for the 2nd case. So what EXACTLY is your problem here? As far as LINE_BREAKER goes, we cannot help you unless you show us multiple events exactly the way that they are in the file (with all variations).
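To reason about where events would break, it can help to simulate the documented LINE_BREAKER behaviour outside Splunk. The sketch below is in Python and rests on several assumptions: it follows the rule from the props.conf documentation that the previous event ends where the first capturing group starts, the next event begins where that group ends, and the text matched by the group is discarded. The breaker pattern `>(\s*)<SmartNoSubstitution\s` is a hypothetical example, not something taken from this thread, and `split_events` is an illustrative helper. Whether Splunk accepts a capturing group that can match zero characters should be verified on a test instance.

```python
import re


def split_events(raw, line_breaker):
    """Split raw text at each match of line_breaker, dropping capture group 1."""
    events, start = [], 0
    for match in re.finditer(line_breaker, raw):
        # The previous event ends where group 1 starts ...
        events.append(raw[start:match.start(1)])
        # ... and the next event begins where group 1 ends.
        start = match.end(1)
    events.append(raw[start:])
    return [event for event in events if event]


# Shortened copy of the first sample file from the question.
RAW = (
    '<Subscriber SubscriberID="126" DeviceID="2915">'
    '<SmartNoSubstitution EventTime="2019-07-08T18:45:53">'
    '<availId>175696022</availId></SmartNoSubstitution>'
    '<SmartNoSubstitution EventTime="2019-07-08T18:57:05">'
    '<availId>175696024</availId></SmartNoSubstitution>'
    '</Subscriber>'
)

# Hypothetical breaker: break in front of every <SmartNoSubstitution element.
BREAKER = r'>(\s*)<SmartNoSubstitution\s'

events = split_events(RAW, BREAKER)
for number, event in enumerate(events, start=1):
    print(number, event)
```

With this hypothetical breaker, the header up to the Subscriber tag becomes its own event and each SmartNoSubstitution element starts a new one. The SubscriberID and DeviceID attributes then no longer sit in the same event as the availId values, which is worth considering before choosing where to break.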
A few comments:
- Use TIME_PREFIX=EventTime=" (it probably also works with just TIME_PREFIX=EventTime, but better to be as specific as possible, I would say).
- Your LINE_BREAKER doesn't make much sense to me. Do you want to break the events at <SmartNoSubstitution?
- Set KV_MODE = none in props.conf.
- Remove REPEAT_MATCH=true, since that setting only applies to index-time extractions.
Hi Frank,
Thanks for your comments. I have tried what you suggested; however, the line breaking and field extraction still do not work.
I still have fields that are based on elements with an =, but anything after SmartNoSubstitution is not being extracted.
What LINE_BREAKER are you using now? As I said, what you have doesn't make much sense to me, and I didn't suggest anything else yet.
Then I guess the first thing to do is some troubleshooting to confirm whether Splunk is really using the configuration at all.
Check (e.g. using btool) that the indexers / heavy forwarders have the configuration for the index time things (line breaking, timestamping). Have you restarted them after making the changes? Make sure when testing that you are actually looking at freshly ingested events, otherwise you're not going to see the effect of any changes to index time config.
Check the Search Heads have the field extraction config (e.g. confirm it is present from the GUI Settings -> Fields and has appropriate permission and sharing settings to make the config available in the app where you run the search).