I have several raw XML events that are being indexed from a monitored log, which is forwarded by a universal forwarder. Each XML event is quite long, and I only want to index certain fields from it. How can I do this before indexing so that I stay under my daily volume limit, which I am starting to exceed?
Best practice is to do this outside the indexers, so that they are not loaded with work that other components can do and stay free for the work that only indexers can do. You could do this with a heavy forwarder, but I would not. I would write my own pre-parser to pull out the fields I need and write them to a different file in a dedicated directory used only for these files, then have the Splunk UF monitor that directory of pre-processed files.
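For what it's worth, here is a minimal sketch of what such a pre-parser might look like in Python. Everything specific in it is an assumption for illustration: the `Event` tag, the `Timestamp`, `User` and `Status` field names, and the function names are hypothetical, and it assumes each event is a well-formed, non-nested, namespace-free XML block. Adjust it to your real data.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical names: change the event tag and field list to match your log.
EVENT_TAG = "Event"
KEEP_FIELDS = ["Timestamp", "User", "Status"]


def slim_event(event_xml, keep=KEEP_FIELDS):
    """Return a one-line XML event holding only the elements named in `keep`."""
    src = ET.fromstring(event_xml)
    out = ET.Element(src.tag)
    for name in keep:
        node = src.find(f".//{name}")  # first match anywhere below the root
        if node is not None:
            child = ET.SubElement(out, name)
            child.text = (node.text or "").strip()
    return ET.tostring(out, encoding="unicode")


def slim_log(raw_text, event_tag=EVENT_TAG, keep=KEEP_FIELDS):
    """Find each <event_tag>...</event_tag> block and reduce it to the kept fields."""
    pattern = re.compile(rf"<{event_tag}\b.*?</{event_tag}>", re.DOTALL)
    return [slim_event(m.group(0), keep) for m in pattern.finditer(raw_text)]


def preprocess_file(src_path, dst_path):
    """Append the slimmed events, one per line, to the file the UF monitors."""
    events = slim_log(Path(src_path).read_text(encoding="utf-8"))
    with open(dst_path, "a", encoding="utf-8") as fh:
        for line in events:
            fh.write(line + "\n")
    return len(events)
```

You would call `preprocess_file` with the raw log as the source and a file in the dedicated directory as the destination. Note that this sketch rereads the whole source file on every run, so in practice you would either run it once per rotated file or keep track of how far you have already read, otherwise events get written twice.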
@riotto, you can define props.conf settings to index only a selected part of the XML. Try something like the following:
BREAK_ONLY_BEFORE=\<yourRequiredXMLNode\>
MUST_BREAK_AFTER=\<\/yourRequiredXMLNode\>
If the part of the XML you want to index also contains a timestamp field, you will need to define TIME_PREFIX and TIME_FORMAT as well.
If that does not work, please share some sample XML data (after mocking or anonymizing anything sensitive) along with the props.conf you currently have for your sourcetype.