I've been looking around a bit, but it appears my google-fu isn't up to snuff for this problem.
I'm wondering how one can parse non-pure XML logs. That is, we have a ".txt" file that contains timestamps, debug messages and all that jazz, but it also writes entire XML responses to the log file. I don't know if the source log is able to separate them out (I doubt it), but the question goes as follows.
I'm aware there is an XML parser for props.conf, but does it work on intermittent XML-formatted log entries?
<date>,source,debug: event happened
<date>,source,debug: function executed
<date>,source,debug: send data to URI
<date>,source,debug: multiline XML response
The XML response can sometimes be longer than 1000 lines, which is where the issue arises: we get an error because the log entry is more than 1000 lines long. It's all XML tags, <meta> </meta> <link> </link> and the like.
Does props.conf support something along the lines of
XML do XML parsing
do normal props things
or do I have to parse all the XML response segments manually?
I don't know if this is feasible for you but you should be able to rename a sourcetype based on regex (as described here https://community.splunk.com/t5/Splunk-Search/Changing-sourcetype-with-regex/m-p/155058#M43613).
With this you could use the default XML sourcetype for the XML responses and keep the original sourcetype for the rest of your data.
I am not sure about the long XML, though.
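A minimal sketch of that sourcetype override, assuming the mixed log comes in as a sourcetype called my_app_log and the XML events should end up as my_app_xml. Both names are made up for illustration, and so is the regex, which you would have to adapt to how your XML responses actually start:

```ini
# props.conf
# "my_app_log" is a hypothetical name for the original sourcetype.
[my_app_log]
TRANSFORMS-set_xml_sourcetype = set_xml_sourcetype

# "my_app_xml" is a hypothetical name for the rewritten sourcetype.
# KV_MODE = xml asks for XML field extraction at search time.
[my_app_xml]
KV_MODE = xml

# transforms.conf
[set_xml_sourcetype]
# Assumption: an event counts as an XML response if it contains an
# XML declaration or a <meta> tag. Adjust this to your real data.
REGEX = <\?xml|<meta>
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::my_app_xml
```

Note that the override is applied per event, so the whole XML response has to be a single event already. As far as I know, the line-breaking settings of the new sourcetype are not applied after the rewrite, only its search-time settings such as KV_MODE.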
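If the error comes from the limit on how many lines get merged into one event, it might be worth looking at MAX_EVENTS in props.conf, which defaults to 256 lines. A sketch, again with a hypothetical sourcetype name and with values that are guesses rather than recommendations:

```ini
# props.conf
# "my_app_log" is a hypothetical sourcetype name.
[my_app_log]
SHOULD_LINEMERGE = true
# Maximum number of input lines merged into a single event (default 256).
# 5000 is an arbitrary example value, pick one that fits your responses.
MAX_EVENTS = 5000
# Maximum line length in bytes (default 10000). 0 disables truncation.
TRUNCATE = 0
```

I have not tested this against responses that long, so treat it as a starting point.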