Hi, I am trying to extract the values of certain fields present in the log for a particular operation.
My query:
Service="X1" Operation="Y1" AND AuditType="REQUEST_OUTBOUND" | sort _time | xmlunescape | xmlkv | fields + A, B
Fields A and B each occur multiple times in the log message, as shown below:
Message in log:
<A>1000</A>
<A>29</A>
<A>30</A>
<B>B1</B>
<B>D2</B>
<B>C3</B>
I need my output to look like this:
A B
1000 B1
29 D2
30 C3
But I am getting this result instead:
A B
30 C3
I am getting only the last value for each field name, not all of the values.
Could anyone please help me get the desired output? Please let me know which command I should use and how to modify the search.
This limitation seems to stem from the xmlkv command in the search app. It appears that the Python script overwrites the previous value each time it encounters another value for the same key.
Here's the culprit:
for kvpair in XML_KV_RE.findall(rawOut):
    r[kvpair[0]] = kvpair[1]
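To see the overwrite outside Splunk, here is a minimal sketch in plain Python. The pattern `TAG_RE`, the function name, and the sample message are simplified stand-ins (assumptions for illustration), not the code shipped with xmlkv; only the dictionary assignment mirrors the loop above.

```python
import re

# Simplified stand-in pattern (an assumption, not the one used by xmlkv):
# captures <tag>value</tag> pairs that carry no attributes.
TAG_RE = re.compile(r"<(\w+)>([^<]*)</\1>")

message = "<A>1000</A>\n<A>29</A>\n<A>30</A>\n<B>B1</B>\n<B>D2</B>\n<B>C3</B>"

def extract_last_wins(raw):
    """Mimic the original loop: a repeated key replaces the earlier value."""
    fields = {}
    for key, value in TAG_RE.findall(raw):
        fields[key] = value  # overwrites any earlier value stored for this key
    return fields

print(extract_last_wins(message))  # {'A': '30', 'B': 'C3'}
```

Only the last A and the last B survive, which matches the single row reported in the question.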
It didn't take much time to create a "hack" to fix this, but it's not an ideal solution. This will work:
In myxml.py (a copy of the xmlkv script saved under a new name), replace the code above with the following. Be careful with the indentation: everything under the "for kvpair" line is indented, and the lines under "if" and under "else:" are indented one level further. If the indentation is wrong, you'll probably just get an error in the Splunk UI when you run the command we're building.
for kvpair in XML_KV_RE.findall(rawOut):
    if kvpair[0] not in r:
        r[kvpair[0]] = kvpair[1]
    else:
        r[kvpair[0]] = r[kvpair[0]] + ", " + kvpair[1]
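The effect of the modified loop can be checked outside Splunk with a small stand-alone sketch. The pattern `TAG_RE`, the function name `extract_all`, and the sample message are assumptions made for illustration; only the if/else accumulation is taken from the replacement code.

```python
import re

# Simplified stand-in pattern (an assumption, not the one in myxml.py).
TAG_RE = re.compile(r"<(\w+)>([^<]*)</\1>")

def extract_all(raw):
    """Keep every value for a repeated key, joined with ", "."""
    fields = {}
    for key, value in TAG_RE.findall(raw):
        if key not in fields:
            fields[key] = value  # first occurrence of this key
        else:
            fields[key] = fields[key] + ", " + value  # append later ones
    return fields

message = "<A>1000</A><A>29</A><A>30</A><B>B1</B><B>D2</B><B>C3</B>"
print(extract_all(message))  # {'A': '1000, 29, 30', 'B': 'B1, D2, C3'}
```

One consequence of joining into a single string is that a value which itself contains ", " can no longer be told apart from two separate values when the field is split later.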
Create/edit $SPLUNK_HOME/etc/apps/search/local/commands.conf and add the following:
[myxml]
filename = myxml.py
retainsevents = true
overrides_timeorder = false
Restart Splunk
We have just created a new command, myxml, that you can use in place of xmlkv to collect the multivalue XML fields. The multivalue XML fields will now show up as comma-separated values. For instance, A = "1000, 29, 30" and B = "B1, D2, C3". If you need to split the fields into individual values, you can use makemv.
Here is a sample search:
sourcetype=xmlkv | myxml | makemv delim=", " A | makemv delim=", " B
When I perform this in the lab, I see A(3) and B(3) in the Field Picker, and I can perform operations on any of the individual values.
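For intuition only, this is roughly what the split amounts to, written as plain Python rather than SPL. The helper `split_values` is hypothetical and says nothing about how makemv is implemented. The sketch also pairs A and B by position, which is the row layout the question asks for; inside Splunk that pairing needs further search commands (the mvzip eval function and mvexpand are the usual candidates), which are not shown here.

```python
def split_values(joined, delim=", "):
    """Split a delimiter-joined string into a list of individual values."""
    return joined.split(delim)

a_values = split_values("1000, 29, 30")
b_values = split_values("B1, D2, C3")

# Pair the values by position: one (A, B) row per original pair.
rows = list(zip(a_values, b_values))
for a, b in rows:
    print(a, b)
# 1000 B1
# 29 D2
# 30 C3

# In Python, splitting on "," alone leaves a leading space on later values.
print(split_values("1000, 29, 30", delim=","))  # ['1000', ' 29', ' 30']
```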
HTH
ron
Here is a post of the myxml.py file for the above solution. I wasn't able to get this to format properly in the initial answer:
# Copyright (C) 2005-2010 Splunk Inc. All Rights Reserved. Version 4.0
import sys,splunk.Intersplunk
import re
import urllib
import xml.sax.saxutils as sax

XML_KV_RE = re.compile(r"<(.*?)(?:\s[^>]*)?>([^<]*)")

try:
    results,dummyresults,settings = splunk.Intersplunk.getOrganizedResults()
    for r in results:
        if "_raw" in r:
            raw = r["_raw"]
            # Unescape repeatedly until the text stops changing
            rawOut = sax.unescape( raw )
            while( rawOut != raw ):
                raw = rawOut
                rawOut = sax.unescape( raw )
            r["_raw"] = rawOut
            # Append repeated keys instead of overwriting them
            for kvpair in XML_KV_RE.findall(rawOut):
                if kvpair[0] not in r:
                    r[kvpair[0]] = kvpair[1]
                else:
                    r[kvpair[0]] = r[kvpair[0]] + ", " + kvpair[1]
except:
    import traceback
    stack = traceback.format_exc()
    results = splunk.Intersplunk.generateErrorResults("Error : Traceback: " + str(stack))

splunk.Intersplunk.outputResults( results )
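The unescape loop in the script can be tried on its own, without Splunk. In this sketch the function name `unescape_fully` and the double-escaped sample string are assumptions for illustration; `xml.sax.saxutils.unescape` is the same standard-library call the script uses.

```python
import xml.sax.saxutils as sax

def unescape_fully(raw):
    """Apply sax.unescape repeatedly until the text stops changing."""
    out = sax.unescape(raw)
    while out != raw:
        raw = out
        out = sax.unescape(raw)
    return out

# A hypothetical payload that was XML-escaped twice before being logged.
double_escaped = "&amp;lt;A&amp;gt;1000&amp;lt;/A&amp;gt;"
print(unescape_fully(double_escaped))  # <A>1000</A>
```

A single pass would only get as far as `&lt;A&gt;1000&lt;/A&gt;`, which is why the script loops until the text is stable before extracting tags.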
Hi Ron
It worked. Thanks a lot for your reply.
My sincere apologies for the late response; I had some network connectivity issues and couldn't sign in for a long time.
Once again thanks for your valuable inputs.
I tried using "search .... | multikv fields A B" but I am still not getting any output, and I am not sure why. Can someone please help me fix this?