All of our indexed data is in XML format. I've been able to set up extractions for single-valued attributes and element values.
However, I haven't figured out how to deal with XPaths that match multiple values. I've tried using xmlkv, but there is little to no documentation for it in the Splunk documentation or on answers.splunk.com.
For example:
<a>1</a><a>2</a><a>3</a>
I want a field or fields pulled out with a certain name for the values 1,2 and 3. xmlkv doesn't seem to do anything. How can I get the values 1,2 and 3 pulled out into a field?
xmlkv looks for values inside tags of the form <fieldname>value</fieldname>. Each time it finds that pattern it sets an extracted field to that value.
If there are multiple keys with the same name, the value of the last one will be used. Note that value may be an empty string, in which case the field will be set to null and may not appear in the list.
For your example, you should see an additional field in the field picker named a, containing a value of 3.
xmlkv is actually a Python-based command in the search app, so you can look at the source code in apps/search/bin/xmlkv.py if you're so inclined.
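To make the "last one wins" behaviour concrete, here is a rough Python sketch. It is not the actual xmlkv.py source: the regex and the function name extract_last_wins are assumptions made up for illustration, and the real script may differ in its details.

```python
import re

# Hypothetical pattern for <name ...>value</name>. The \1 backreference
# requires the closing tag to match the opening tag name.
TAG_RE = re.compile(r"<([^\s<>/]+)(?:\s[^>]*)?>([^<]*)</\1>")


def extract_last_wins(raw):
    """Return {tag: value}, keeping only the last value seen for each tag."""
    fields = {}
    for name, value in TAG_RE.findall(raw):
        # A later match with the same tag name overwrites the earlier one.
        fields[name] = value
    return fields


print(extract_last_wins("<a>1</a><a>2</a><a>3</a>"))
```

With the sample event from the question, only the value 3 survives for a, which matches what the field picker shows.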
To add to what sotheringtop said, you can always make a copy of the "xmlkv.py" script and make it multi-value aware so that "a" is returned as a multi-value field that contains [ 1, 2, 3 ].
But on second thought, it may be easier to just do this with a transform in transforms.conf like so:
[xmlkv_multivalue]
REGEX = <(.*?)(?:\s[^>]*)?>([^<]*)</\1>
FORMAT = $1::$2
MV_ADD = true
So instead of adding | xmlkv to your search, add | extract xmlkv_multivalue and see if that gets you what you want.
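As a rough illustration of what a multi-value extraction amounts to, the sketch below collects every match for a tag name into a list instead of overwriting it. This is plain Python, not Splunk's implementation; the regex is a slightly stricter variant of the one in the stanza, and the name extract_multivalue is hypothetical. A multi-value-aware copy of xmlkv.py could take a similar shape.

```python
import re
from collections import defaultdict

# Hypothetical pattern for <name ...>value</name>; attributes on the opening
# tag are skipped, and \1 ties the closing tag to the opening tag name.
TAG_RE = re.compile(r"<([^\s<>/]+)(?:\s[^>]*)?>([^<]*)</\1>")


def extract_multivalue(raw):
    """Return {tag: [values...]}, keeping every value in document order."""
    fields = defaultdict(list)
    for name, value in TAG_RE.findall(raw):
        # Append rather than overwrite, so repeated tags accumulate.
        fields[name].append(value)
    return dict(fields)


print(extract_multivalue('<a>1</a><a>2</a><a>3</a><b x="1">z</b>'))
```

Like the regex in the stanza, this only sees leaf elements whose content contains no nested tags.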
|spath handles both JSON and XML.
|makeresults |eval _raw="<a>1</a><a>2</a><a>3</a>" |spath path=a output=a
...or just
|makeresults |eval _raw="<a>1</a><a>2</a><a>3</a>" |spath
Agreed - the transform is a better way than modifying xmlkv.py.
Thank you for responding so quickly and with fantastic descriptions.
At this point I've extracted the parent element of the "a" elements, and I search on it with leading and trailing * wildcards.