Hey there!
I'm trying to monitor (or batch) a folder containing XML files.
The XML files don't necessarily share the same structure; they are hierarchical, and the depth of nesting can vary.
Where and how do I configure a sourcetype that knows how to handle this kind of case, so I won't have to parse the data with rex at search time?
Example of a file that may exist:
The rex command is very flexible, but there are others you can use. Consider xpath and xmlkv.
In props.conf, consider using KV_MODE = xml to have Splunk automatically extract fields.
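As a rough sketch, the stanza might look like the one below. The sourcetype name my_xml is made up for illustration, so substitute whatever sourcetype you assign to these files. KV_MODE is a search-time setting, so the stanza belongs in props.conf on the instance where searches run.

```
# props.conf
# "my_xml" is a hypothetical sourcetype name, not one from this thread
[my_xml]
# Extract fields from XML events automatically at search time
KV_MODE = xml
```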
Writing a regex with rex field=_raw is a great solution, but I would like to do it at the sourcetype level so I won't have to write long and complicated queries.
I tried xmlkv and KV_MODE = xml; neither extracts fields with two or more levels of hierarchy, so I'm missing a lot of fields.
Any more suggestions please?
Have you tried KV_MODE = xml in props.conf?
Is there a way to see how it will work before applying it, so that if it doesn't work as planned I won't have to delete all the data I ingested?
Did you happen to check how it works on generic XML files or on a sample from somewhere else?
Regrettably, there is no way to see how fields will be extracted before ingesting data. The Extract Field wizard lets you preview extractions, but it requires events that have already been indexed.
This is a good use for a test system, even if it's your workstation. Capture some sample data in a file, transfer the file to the test system and experiment with field extractions there. Once you have it working as desired, export the settings in an app for installation in Production. When you're done, just delete the index you used for testing.
If you can't use a test system then you'll have to test in Production. Use a separate index (I call mine "test") until you have the extractions working right. Since you're likely to be using search-time extractions, you should need to ingest the data only once.
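A minimal sketch of an input that sends sample files to the test index is shown below. The directory path and the sourcetype name my_xml are placeholders I made up, and the stanza assumes the "test" index has already been created. I used monitor rather than batch here because a batch input deletes the files after reading them, which is usually not what you want while experimenting.

```
# inputs.conf
# The path and sourcetype below are hypothetical placeholders
[monitor:///path/to/xml_samples]
# Keep test events out of production indexes
index = test
sourcetype = my_xml
disabled = false
```

Because the extractions happen at search time, you can keep adjusting props.conf and re-running searches against the same indexed events.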