Multiple fields with same name in XML data

aoleske
Path Finder

Good Morning all,
I am having an issue searching some FNXML data that contains multiple fields with the same name. I am trying to extract all of the fields so that every entry is shown for troubleshooting purposes. I have tried nomv and mvcombine, but can't seem to get them to work correctly. I have also tried regex, but it only grabs the first field of each type. It would be nice to use xmlkv to get the fields, but it seems to be restricted to grabbing only the first occurrence of a field.

In the example below, I am trying to grab the field_name and context_type fields. To make matters more difficult, the events are radically different from each other: they do not necessarily have the same number of fields, nor are the fields in the same order. The data does follow the overall format. Our overall visualization is a table that shows all the entries for the desired fields.

Below is a snippet of data to illustrate what I am trying to work with. I do have the FNXML results in an XML parser, and if you would like to see how it's laid out, contact me and I will send the output to you. (I was unable to attach it here.)

Action 12/13/2017 2:21:09 PM: Raw XML: DATAMGR48573028475-73c5-4181-bee2-b5f8e710f590123456DOCUMENT #1112345updateupdateDOCUMENT REVISION11K2updateupdateDOCUMENT DESCRIPTION11SOMETHING ASSY - FFFPupdateupdateDOCUMENT #12300651-500updateupdate DOCUMENT DESCRIPTION 14AWESOME ASSY-FFFPupdateupdate

I have tried many searches, which have been long wiped out, but a few of the basic searches I am trying look like this:
host=velocity1 index=velocity sourcetype="Velocity:intercim" "RAW XML" | xmlkv maxinputs=10000 | table user_id,app_session_id,context_type,field_name | mvcombine field_name | mvcombine context_type

I tried rex, just on one field to try and get the data and get no data back:
host=velocity1 index=velocity sourcetype="Velocity:intercim" "RAW XML" | rex "context_type>(?.*?)\" | table context_type

I suspect I will have to use regex and do some kind of global search, but am unclear on how to do that, as the overall format of the sections change and do not always follow the same pattern.
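
For illustration, one way to do a "global" match with rex is its max_match option: max_match=0 asks rex to keep every match instead of only the first, so each extracted field becomes multivalued. The sketch below is run-anywhere and uses a trimmed-down, made-up event (the two FieldData blocks and their values are assumptions, not real data). The two fields are extracted independently of each other, so their values only line up row for row if every block contains both tags.

```spl
| makeresults
| eval _raw="Raw XML: <FNXML><FieldData><field_name>DOCUMENT #</field_name><Context><context_type>update</context_type></Context></FieldData><FieldData><field_name>DOCUMENT REVISION</field_name><Context><context_type>create</context_type></Context></FieldData></FNXML>"
``` max_match=0 keeps every occurrence, not just the first ```
| rex max_match=0 "<field_name>(?<field_name>[^<]*)</field_name>"
| rex max_match=0 "<context_type>(?<context_type>[^<]*)</context_type>"
| table field_name context_type
```

Because the patterns match the tags themselves, this does not depend on where in the XML the tags appear or in what order.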


niketn
Legend

@aoleske, please find below a run-anywhere search based on your sample data. Replace the first two pipes, | makeresults and | eval _raw..., with your existing base search; they only generate the sample data for this example.

| makeresults
| eval _raw=" Action 12/13/2017 2:21:09 PM: Raw XML: <FNXML><Session><user_id>DATAMGR</user_id><app_session_id>48573028475-73c5-4181-bee2-b5f8e710f590</app_session_id></Session><ShopOrderOper><Context><context_type></context_type></Context><shop_order_oper_key>123456</shop_order_oper_key><ShopOrderOperChildList><FieldData><field_name>DOCUMENT #</field_name><revision>1</revision><sample_no>1</sample_no><value>12345</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT REVISION</field_name><revision>1</revision><sample_no>1</sample_no><value>K2</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT DESCRIPTION</field_name><revision>1</revision><sample_no>1</sample_no><value>SOMETHING ASSY - FFFP</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT #</field_name><revision>1</revision><sample_no>2</sample_no><value>300651-500</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update> </FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT DESCRIPTION</field_name> <revision>1</revision><sample_no>4</sample_no><value>AWESOME ASSY-FFFP</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData></ShopOrderOperChildList></ShopOrderOper></FNXML>"
| rex "Raw XML: (?<_raw>.*)"
| spath
| rename "FNXML.ShopOrderOper.ShopOrderOperChildList.FieldData.*" as "*"
| eval data=mvzip(field_name,mvzip(value,mvzip(revision,mvzip(sample_no,'Context.context_type'))))
| table FNXML.Session.app_session_id FNXML.Session.user_id FNXML.ShopOrderOper.shop_order_oper_key data
| mvexpand data
| eval data=split(data,",")
| eval field_name=mvindex(data,0)
| eval value=mvindex(data,1)
| eval revision=mvindex(data,2)
| eval sample_no=mvindex(data,3)
| eval "Context.context_type"=mvindex(data,4)
| fields - data

For details on multi value commands/functions please refer to Splunk Documentation:
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/MultivalueEvalFunctions
and other commands as well like http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Mvexpand

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"


aoleske
Path Finder

Hi Niketnilay,
Thank you so much for the answer. I have tried running it against our output, and it is not returning any results. I will dig into this more on the 26th. I do have a question about the eval statements. It appears that they will always look for a fixed format of expecting fields to be in certain positions. Our data sometimes has all these fields, sometimes adds fields, and sometimes is missing some of these fields, even within the same event. Some have a single context statement and field_name, and others have up to 50 or more of each. To further complicate matters, there is no way to know what the event will look like ahead of time, and we get thousands of events in an hour. I guess what we are looking for is a way to parse the XML and find the designated fields wherever they are in the XML. Those fields will always remain fixed names, such as context_type and field_name. Thank you! I appreciate you looking at this with us.


aoleske
Path Finder

It dawned on me that it may fail because you just have a snippet of data. I am submitting some full examples of data for you to look at.


aoleske
Path Finder

I have added the data... Thanks for looking at this for me!


aoleske
Path Finder

Hi Niketnilay,
I got the code to work on the specific search I submitted. I had to add the following to the beginning to strip the header so that all it sees is the XML:
| rex mode=sed "s/Action.*Raw XML://g"
| rex "Raw XML: (?<_raw>.*)"
| spath
| rename "FNXML.ShopOrderOper.ShopOrderOperChildList.FieldData.*" as "*"
.
.
.
I also added a | rex mode=sed "s/,//g" pipe, because | eval data=split(data,",") is thrown off whenever a field value contains a ",": the pieces shift by one position. The output is restored on the next line. What this means is that context_type winds up being "3" instead of "update". That happened because the FieldData.value was "Great Assembly, FRPA". Stripping the commas resolves this problem.

The answer provided seems to only work with the event type submitted originally. I have submitted other event types to give a better idea of what the range of events looks like.


niketn
Legend

I have converted comment to answer. Please refer to the same.


niketn
Legend

@aoleske, you should post data and Splunk queries using the code button (101010) on Splunk Answers so that special characters are not escaped or stripped.

When using the mvzip() function you can define your own delimiter in case the default delimiter, a comma (,), is already present in your field values. The delimiter can be a single character or several. The following example generates multi-value data1 and data2 fields whose values contain commas, and uses ##@@ as the mvzip delimiter. If you prefer, you can pick something simpler, such as the pipe character |, provided it does not occur in the values being zipped.

| makeresults
| eval data1="data,1a;data1b",data2="data,2a;data2b"
| eval data1=split(data1,";"),data2=split(data2,";")
| fields - _time
| eval data=mvzip(data1,data2,"##@@")
| mvexpand data
| eval data=split(data,"##@@")
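
As a further sketch, the same idea can be applied to fields shaped like the ones in this thread. The two field_name/value pairs below are made up for illustration (one value contains a comma, like the "Great Assembly, FRPA" case mentioned earlier), and the pipe delimiter is an assumption that only holds if no value contains a pipe.

```spl
| makeresults
| eval field_name=split("DOCUMENT #;DOCUMENT DESCRIPTION",";"), value=split("12345;Great Assembly, FRPA",";")
``` zip with a delimiter that does not occur in the values ```
| eval data=mvzip(field_name,value,"|")
| mvexpand data
``` split on the same delimiter, so the comma inside the value is left alone ```
| eval data=split(data,"|")
| eval field_name=mvindex(data,0), value=mvindex(data,1)
| fields - data
```

With this approach there is no need to strip commas from the event before zipping.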

I don't think sed on the raw data is required, provided Raw XML: is always present in your raw events, since the following rex looks for that text anywhere in the event.

| rex "Raw XML: (?<_raw>.*)"

aoleske
Path Finder

Hi Niketnilay,
Thanks for looking into this for me. For whatever reason, it seems I need to remove the header before parsing. When I don't, all that gets returned is the entire XML event. I have decided to go the simple route with this. I just pipe the XML through spath by itself and let it dump all the fields it finds into the output. It grows and adds new fields as they appear. I do remove a bunch of fields in the output (fields -) that are not relevant, just to tone down the amount of scrolling that needs to be done. Once again, thank you for looking at this for me. I will mark this as accepted.
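
A minimal run-anywhere sketch of that simpler approach, assuming the header always ends with "Raw XML:". The sample event is trimmed down for illustration, and the *FieldDataOptions* wildcard is only an example of fields one might choose to drop, not the actual list used.

```spl
| makeresults
| eval _raw="Action 12/13/2017 2:21:09 PM: Raw XML: <FNXML><Session><user_id>DATAMGR</user_id></Session><ShopOrderOper><ShopOrderOperChildList><FieldData><field_name>DOCUMENT #</field_name><value>12345</value><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData></ShopOrderOperChildList></ShopOrderOper></FNXML>"
``` strip everything up to and including the header so only XML remains ```
| rex mode=sed "s/^.*Raw XML:\s*//"
``` let spath extract whatever fields it finds ```
| spath
``` drop fields that are not relevant to the table ```
| fields - *FieldDataOptions*
```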


niketn
Legend

I am glad you were able to find your working solution. Do not forget to accept and up vote the comments that helped 🙂


aoleske
Path Finder

Happy New Year, Niketnilay! I have clicked on accept for the answer and upvoted the other two comments as they have given insight into the answer as well. Let me know if they are not showing up on your end as accepted.
Thanks again!
Andrew


niketn
Legend

@aoleske Happy New Year to you and all your loved ones. I don't think you have accepted the answer yet. You should see an Accept link right below the first answer at the top.
Let us know if it is not working.


aoleske
Path Finder

Sorry, I was out sick for two days. It should be accepted now! Have a great day and thanks again!


niketn
Legend

@aoleske, you can try running steps one by one.

For example, the first check would be whether the following returns _raw containing only the XML:

 <YourBaseSearch>
 | rex "Raw XML: (?<_raw>.*)"
 | table _raw 

If you don't want to override _raw while debugging, you can extract into a new field instead, for example:

 <YourBaseSearch>
| rex "Raw XML: (?<xmlData>.*)"
| spath input=xmlData
| <remainingQueryRemainsTheSameAsMyAnswer>

If your XML may or may not have all the fields, ideally you should have empty placeholder nodes present in the XML so that the structure remains the same, for example <someField1>10</someField1><someField2></someField2><someField3>20</someField3> instead of <someField1>10</someField1><someField3>20</someField3>. This could be done in Splunk at query time as well; however, that would make the query even more expensive. Multi-value commands and functions like mvexpand and mvzip() are already expensive.

However, I would request that you post two or three example XMLs. This would indeed be a complicated scenario, but how to tackle it would depend on your data. Meanwhile, I will convert my answer to a comment so that this question is flagged as unanswered until further details are provided.
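
For illustration only, one possible way to cope with blocks that lack some fields is to carve out each FieldData block first and parse every block on its own, so a missing tag simply leaves that column empty for that row. This is a sketch under assumptions: the sample event is made up (the second block deliberately has no value tag), and it assumes FieldData blocks are never nested inside each other.

```spl
| makeresults
| eval _raw="Action 12/13/2017 2:21:09 PM: Raw XML: <FNXML><ShopOrderOper><ShopOrderOperChildList><FieldData><field_name>DOCUMENT #</field_name><value>12345</value><Context><context_type>update</context_type></Context></FieldData><FieldData><field_name>DOCUMENT REVISION</field_name><Context><context_type>create</context_type></Context></FieldData></ShopOrderOperChildList></ShopOrderOper></FNXML>"
``` capture every FieldData block as one value of a multivalue field ```
| rex max_match=0 "(?<field_data><FieldData>.*?</FieldData>)"
| fields - _raw
``` one result row per block ```
| mvexpand field_data
``` parse each block independently of the others ```
| spath input=field_data
| rename "FieldData.Context.context_type" as context_type
| rename "FieldData.*" as "*"
| table field_name value context_type
```

This avoids mvzip() and positional mvindex() lookups, but mvexpand is still involved, so the cost concern above applies.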


aoleske
Path Finder

and context_type is a text field.


aoleske
Path Finder

Mayurr98, Here is the info about the context_type:

Context type is found in various FNXML sections, such as ShopOrderOper, ShopOrder, and FieldData. The majority are in FieldData, which is a part of a larger “Session” (another field.) There are numerous FieldData sections. Some have all the following info, some have less, and some add a field or two. Here is the breakdown of where some context_types are layered in. (This is not all inclusive – it changes by event type, but does show generally how it is used):

Session
    User_id
    App_session_id
shopOrderOper
    Context
        Context_type (null context_type in this location)
    FieldData
        Field_name
        Revision
        Sample_no
        Value
        Context
            Context_type (many different types, such as update, changestatus, assign, laboroff, create, etc.)
        FieldDataOptions
    FieldData
        Field_name
        Revision
        Sample_no
        Value
        Context
            Context_type (many different types, such as update, changestatus, assign, laboroff, create, etc.)
        FieldDataOptions
    REPEATS
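
As a sketch of how the context_type values at these different levels could be kept apart, spath accepts an explicit path and an output field name. The paths below follow the sample event posted in this thread (FieldData under ShopOrderOper.ShopOrderOperChildList); other event types would need their own paths. The output names oper_context_type and field_context_type and the trimmed-down event are made up for illustration.

```spl
| makeresults
| eval _raw="<FNXML><ShopOrderOper><Context><context_type></context_type></Context><ShopOrderOperChildList><FieldData><field_name>DOCUMENT #</field_name><Context><context_type>update</context_type></Context></FieldData><FieldData><field_name>DOCUMENT REVISION</field_name><Context><context_type>changestatus</context_type></Context></FieldData></ShopOrderOperChildList></ShopOrderOper></FNXML>"
``` context_type directly under ShopOrderOper ```
| spath path=FNXML.ShopOrderOper.Context.context_type output=oper_context_type
``` context_type inside each FieldData block (multivalued when repeated) ```
| spath path=FNXML.ShopOrderOper.ShopOrderOperChildList.FieldData.Context.context_type output=field_context_type
| table oper_context_type field_context_type
```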


aoleske
Path Finder

I can also send you the breakdown of the FNXML code in XML Notepad, if that helps.


aoleske
Path Finder

I submitted the code and it is being looked at by the moderators.


aoleske
Path Finder
Action 12/13/2017 2:21:09 PM: Raw XML: <FNXML><Session><user_id>DATAMGR</user_id><app_session_id>48573028475-73c5-4181-bee2-b5f8e710f590</app_session_id></Session><ShopOrderOper><Context><context_type></context_type></Context><shop_order_oper_key>123456</shop_order_oper_key><ShopOrderOperChildList><FieldData><field_name>DOCUMENT #</field_name><revision>1</revision><sample_no>1</sample_no><value>12345</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT REVISION</field_name><revision>1</revision><sample_no>1</sample_no><value>K2</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT DESCRIPTION</field_name><revision>1</revision><sample_no>1</sample_no><value>SOMETHING ASSY - FFFP</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT #</field_name><revision>1</revision><sample_no>2</sample_no><value>300651-500</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update> </FieldDataOptions></FieldData><FieldData><field_name>DOCUMENT DESCRIPTION</field_name> <revision>1</revision><sample_no>4</sample_no><value>AWESOME ASSY-FFFP</value><Context><context_type>update</context_type></Context><FieldDataOptions><Update_Is_Create_And_Update>update</Update_Is_Create_And_Update></FieldDataOptions></FieldData></ShopOrderOperChildList></ShopOrderOper></FNXML>

mayurr98
Super Champion

I can help you with the rex... What do you want to extract?
What is context_type in your raw data?


aoleske
Path Finder

Good afternoon! I would greatly appreciate that! I just found the code sample button... Let me attach a small, whittled down version of one of the events.
