Splunk Search

mvzip multivalue pairs with different cardinality

badrinath_itrs
Communicator

Hi,

I am struggling with xml data in splunk and need help in mvzip command to store multi value pairs with different cardinality and expand them .

I have a xml which looks as below in tabular format .

Order , ProductName , Appointment ID
1-23445565, xyz , 12345
1-23456, abcdef, 1233434
2-1223243, cdefg ,
5-123456, dfyg, 122343

I am trying to extract all the multivalue pair values and storing them using mvzip command, but when there is a null value it does not handle it well. I also tried with fillnull command which does not seem to work pretty well here .

My search looks like this at this moment.

search query | | fillnull value=None AppointmentID | eval field1=mvzip(ProductName,AppointmentID )
| rex field=fields1 "(?<ProductName>[^,]+),(?<AppointmentID >[^,]+)"
| table host, source, OrderNumber,ProductName,AppointmentID

Am I doing something wrong here. Any help here is much appreciated.

somesoni2
Revered Legend

Give this a try. Splunk while extracting multivalued field will ignore the null values, and result of that will be count of elements in field with null values (AppointmenID here) will be less than the field (Product here) without null value. The mvzip will do the wrong assignment and will give lesser result. What I'm doing here is adding a default AppointID tag if not present after Product tag (line 3).

index="*" 
 |rex "<Order>(?P<OrderId>[^\<]+)"
 | rex mode=sed "s/(\<\/Product\>)\s*(\<SubOrder\>)/\1<AppointmentID>NA<\/AppointmentID>\2/g"
 |rex max_match=0 "<Product>(?P<tmpProduct>[^\<]+)"
 |rex max_match=0 "<AppointmentID>(?P<tmpAppointmentID>[^\<]+)"
 |eval field1=mvzip(tmpProduct,tmpAppointmentID)
 | mvexpand field1
 | rex field=field1 "(?<ProductName>[^,]+),(?<AppointmentID>[^,]+)"
 | table OrderId,field1, ProductName,AppointmentID

badrinath_itrs
Communicator

Hi somesoni2,

I appreciate your help here.

I just pasted one example here , there are several fields and the data in raw xml could go any further level, hence managing with sed will be difficult .

Regards,
Badri

0 Karma

woodcock
Esteemed Legend

I think the problem is that your CSV has Appointment ID but your search is using AppointmentID. I was able to do what you are saying exactly the way you are trying to do it and it works. Check out this run-anywhere example and peel back the last commands one-by-one to see that/how it works:

|noop|stats count as order | eval order="a,b,c" | makemv delim="," order | mvexpand order | eval ProductName=case(order="a",1,order="b",null(),order="c",3) | eval AppointmentID=case(order="a","A",order="b","B",order="c","C") | fillnull value="None" ProductName AppointmentID | eval kvp=mvzip(ProductName,AppointmentID)

jacobwilkins
Communicator

Give us a better example of your sample data. You aren't clearly showing any mv fields in that example.

0 Karma

badrinath_itrs
Communicator

Hi,

I am struggling with xml data in splunk and need help in mvzip /mvexpand command to store multi value pairs with different cardinality and expand them as value pair correctly.

I have a xml which looks as below .

<xml>
<Order>1-23445565 </Order>
<SubOrder></SubOrder>
<Product>Phone</Product>
<AppointmentID>1234</AppointmentID>
<SubOrder></SubOrder>
<Product>Mobile</Product>
<AppointmentID>12345</AppointmentID>
<SubOrder></SubOrder>
<Product>TV</Product>
<SubOrder></SubOrder>
<Product>Internet</Product>
<AppointmentID>123456</AppointmentID>
<xml>

I am trying to extract all the multivalue pair values ( Product, AppointmentID) and storing them using mvzip command, but when there is a null value it does not handle it well. I also tried with fillnull command which does not seem to work pretty well here .

My search looks like this at this moment to extract the key value pairs.

index="*" 
|rex "<Order>(?P<OrderId>[^\<]+)"
|rex max_match=0 "<Product>(?P<tmpProduct>[^\<]+)"
|rex max_match=0 "<AppointmentID>(?P<tmpAppointmentID>[^\<]+)"
|eval field1=mvzip(tmpProduct,tmpAppointmentID)
| mvexpand field1
| rex field=field1 "(?<ProductName>[^,]+),(?<AppointmentID>[^,]+)"
| table OrderId,field1, ProductName,AppointmentID

In output I am getting below results which is incorrect.

OrderId field1  ProductName AppointmentID
1-23445565  Phone,1234  Phone   1234
1-23445565  Mobile,12345    Mobile  12345
1-23445565  TV,123456   TV  123456

I am missing one record with Product called "Internet" and the AppointmentID is wrongly getting appended to Product "TV" as the same product does not have any AppointmentID in the raw xml data.

Ideally My ouput should look like as below.

OrderId field1  ProductName AppointmentID
1-23445565  Phone,1234  Phone   1234
1-23445565  Mobile,12345    Mobile  12345
1-23445565  TV,"N/A"        TV   "N/A"
1-23445565  Internet,        123456

Any help here is much appreciated.

0 Karma
Get Updates on the Splunk Community!

Earn a $35 Gift Card for Answering our Splunk Admins & App Developer Survey

Survey for Splunk Admins and App Developers is open now! | Earn a $35 gift card!      Hello there,  Splunk ...

Continuing Innovation & New Integrations Unlock Full Stack Observability For Your ...

You’ve probably heard the latest about AppDynamics joining the Splunk Observability portfolio, deepening our ...

Monitoring Amazon Elastic Kubernetes Service (EKS)

As we’ve seen, integrating Kubernetes environments with Splunk Observability Cloud is a quick and easy way to ...