Parsing simple XML fields

altink
Builder

Hello,

I need to parse the fields of the XML below:

1001
vulnerability name 001
2
Audit
0
successfully completed
0
USER1, USER2
xxxxxxxxxxxxxx xxxxxxxxxxxxxx xxxxxxxxxx

The data above comes to Splunk via a TCP input, one XML document like the above per event (record).
The fields (and content) I need are the VLN_ ones.

Index-time extraction is preferred, but search-time is also OK.

Can someone help?

I am at your disposal for further details.

regards
Altin

niketn
Legend

@altink... You may need to repost the XML using the code button (101010) so that it is marked as code and does not get escaped.

In Splunk you can use spath or xpath to parse XML data. Is the entire raw event itself XML, or is only part of your data XML?

If your entire event is XML, you can set KV_MODE=xml for the sourcetype in your props.conf so that Splunk extracts the fields automatically for you at search time. (http://docs.splunk.com/Documentation/Splunk/6.5.3/Admin/Propsconf)
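
As a minimal sketch of that setting, assuming a sourcetype named my_sourcetype (a placeholder, substitute your own), the stanza in props.conf on the search head could look like this:

```ini
# props.conf; the stanza name my_sourcetype is a placeholder for your sourcetype
[my_sourcetype]
# Extract fields from XML events automatically at search time
KV_MODE = xml
```

With this in place, the fields should show up under their fully qualified names, for example CONTROL.VLN_ID.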

If not, you can use rex to first extract only the XML portion and then parse it with the spath command. Refer to the documentation: http://docs.splunk.com/Documentation/Splunk/6.5.3/SearchReference/Spath
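
A rough sketch of the rex plus spath route, using a made-up event built with makeresults so it can be pasted into the search bar as is. The timestamp/host prefix and the field name xml_payload are assumptions for illustration only:

```spl
| makeresults
| eval _raw="2017-06-01 12:00:00 host01 <CONTROL><VLN_ID>1001</VLN_ID><VLN_CATEGORY>Audit</VLN_CATEGORY></CONTROL>"
| rex field=_raw "(?s)(?<xml_payload><CONTROL>.*</CONTROL>)"
| spath input=xml_payload
| table CONTROL.VLN_ID CONTROL.VLN_CATEGORY
```

Here rex copies only the XML portion into xml_payload, and spath then parses that field instead of the whole raw event.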

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

altink
Builder
<CONTROL> 
<VLN_ID>1001</VLN_ID>
<VLN_NAME>vulnerability name 001</VLN_NAME>
<VLN_SEVERITY>2</VLN_SEVERITY>
<VLN_CATEGORY>Audit</VLN_CATEGORY> 
<VLN_SCAN_CODE>0</VLN_SCAN_CODE>
<VLN_SCAN_MESSAGE>successfully completed</VLN_SCAN_MESSAGE>
<VLN_CTRL_FIND>0</VLN_CTRL_FIND>
<VLN_CTRL_SUMMARY>ALDO1, ALTIN1</VLN_CTRL_SUMMARY>
<VLN_CTRL_OUTPUT>xxxxxxxxxxxxxx xxxxxxxxxxxxxx xxxxxxxxxx</VLN_CTRL_OUTPUT>
</CONTROL>

altink
Builder

got it - string quotation needed.
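
For anyone reading later, the quoting rule can be sketched as follows. In where (and eval) expressions, double quotes mark a string literal, while an unquoted or single-quoted word is read as a field name, so VLN_CATEGORY=Audit and VLN_CATEGORY='Audit' both compare against a field called Audit. The event below is made up with makeresults purely for illustration:

```spl
| makeresults
| eval VLN_ID=1001, VLN_SEVERITY=2, VLN_CATEGORY="Audit"
| where VLN_ID=1001 AND VLN_SEVERITY=2 AND VLN_CATEGORY="Audit"
```

The search command does not have this distinction, so | search VLN_CATEGORY=Audit would be another option.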

thank you for the pure SQL,
:-)

regards
Altin

altink
Builder

the macro is this:

index="my_index" sourcetype=my_SourceType
| rename CONTROL.VLN_ID AS VLN_ID
    CONTROL.VLN_NAME AS VLN_NAME
    CONTROL.VLN_SEVERITY AS VLN_SEVERITY
    CONTROL.VLN_CATEGORY AS VLN_CATEGORY
    CONTROL.VLN_SCAN_CODE AS VLN_SCAN_CODE
    CONTROL.VLN_SCAN_MESSAGE AS VLN_SCAN_MESSAGE
    CONTROL.VLN_CTRL_FIND AS VLN_CTRL_FIND
    CONTROL.VLN_CTRL_SUMMARY AS VLN_CTRL_SUMMARY
    CONTROL.VLN_CTRL_OUTPUT AS VLN_CTRL_OUTPUT

altink
Builder

... and it did work with multiple conditions:

`my_MACRO` | where VLN_ID=1001 and VLN_SEVERITY=2 

but only as long as I filter on numeric fields.

If I filter on a string field (or at least a non-numeric one), I see no results, although
events matching the condition do exist. For example:

`my_MACRO` | where VLN_ID=1001 and VLN_SEVERITY=2 and VLN_CATEGORY=Audit

or (Audit quoted)

`my_MACRO` | where VLN_ID=1001 and VLN_SEVERITY=2 and VLN_CATEGORY='Audit'

What is the problem with the non-numeric fields?

All my raw events have VLN_CATEGORY='Audit'.

thank you
Altin

altink
Builder

... but I cannot add field conditions when searching with the macro, for example:

`MY_MACRO` VLN_ID=1001

Error in 'rename' command: Usage: rename [old_name AS/TO/-> new_name]+

I do get the full result (field VLN_ID included) when simply searching:

`MY_MACRO`

How can I permanently save a search and then work (filter, stats...) on its columns?

regards
Altin

niketn
Legend

Since you have not posted the macro, I assume you are trying to have the macro perform all the field renames for you. If so, I would expect the query to look like the following:

`MY_MACRO`
| where VLN_ID=1001

However, the auto-extracted XML field names are fully qualified based on the XML DOM, so you should not rename the fields unless you are trying to alias them for correlation with some other data source. Even though the field names are long, you can put them in macros or calculated fields so that your actual query is shorter and easily reusable.
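
As a sketch of working with the fully qualified names directly, the following uses a made-up event built with makeresults (in real use the base search on your index and sourcetype would replace the first three lines):

```spl
| makeresults
| eval _raw="<CONTROL><VLN_ID>1001</VLN_ID><VLN_SEVERITY>2</VLN_SEVERITY><VLN_CATEGORY>Audit</VLN_CATEGORY></CONTROL>"
| spath
| search CONTROL.VLN_ID=1001 CONTROL.VLN_CATEGORY="Audit"
| stats count BY CONTROL.VLN_SEVERITY
```

Note that inside where or eval expressions a field name containing a dot has to be wrapped in single quotes, for example 'CONTROL.VLN_ID'.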

niketn
Legend

If this helps, kindly Accept the answer and up-vote any comments that may have helped you find your solution.

altink
Builder

Thank you very much,
that worked

niketn
Legend

Glad it worked. Please go ahead and Accept this Answer so that it gets marked as answered!

altink
Builder

Done, with simply `my_MACRO`.
I tried this before; I must have missed something.
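
For anyone hitting the same "Unknown search command 'index'" error: macros are expanded as plain text, so when the definition already begins with the base search, piping into the macro produces "... | index=...", which is not a valid command. A sketch of such a definition, assuming a macros.conf stanza along these lines (the rename list is shortened to two fields for illustration):

```ini
# macros.conf; macro and index names are the placeholders used in this thread
[my_MACRO]
# The definition starts with the base search, so the macro has to come first in the search
definition = index="my_index" sourcetype=my_SourceType | rename CONTROL.VLN_ID AS VLN_ID CONTROL.VLN_NAME AS VLN_NAME
```

Called on its own at the start of a search and followed by further pipes such as | where VLN_ID=1001, it expands to a valid search.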

altink
Builder

I do use the backtick character ( ` ) around the macro name.

altink
Builder

Used rename...

index="my_index" sourcetype=my_Source
| rename CONTROL.VLN_ID AS VLN_ID
    CONTROL.VLN_NAME AS VLN_NAME
    CONTROL.VLN_SEVERITY AS VLN_SEVERITY
    CONTROL.VLN_CATEGORY AS VLN_CATEGORY
    CONTROL.VLN_SCAN_CODE AS VLN_SCAN_CODE
    CONTROL.VLN_SCAN_MESSAGE AS VLN_SCAN_MESSAGE
    CONTROL.VLN_CTRL_FIND AS VLN_CTRL_FIND
    CONTROL.VLN_CTRL_SUMMARY AS VLN_CTRL_SUMMARY
    CONTROL.VLN_CTRL_OUTPUT AS VLN_CTRL_OUTPUT

to rename the fields.
Since the code is long, I still need a macro to have something like a view (RDBMS habit, sorry!).
I created the macro, tested the search string again, and gave permission to the app (inside which I do the search).
When I call it in a search:

 index="my_index" | `my_MACRO`

I receive the error:

Search Factory: Unknown search command 'index'. 

Can you please advise?

regards
Altin

altink
Builder

I tried KV_MODE=xml and got the fields when searching:

index=xxxxx

I ran the above inside the app in which the sourcetype resides.

The (KV_MODE) XML fields are named as follows:
CONTROL.VLN_ID
CONTROL.VLN_CATEGORY
..............................................

Is there any way to remove the "CONTROL." part?
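
One way this could be done at search time, sketched here with a made-up event built with makeresults, is a wildcard rename, which strips the prefix from every matching field in one step:

```spl
| makeresults
| eval _raw="<CONTROL><VLN_ID>1001</VLN_ID><VLN_CATEGORY>Audit</VLN_CATEGORY></CONTROL>"
| spath
| rename CONTROL.* AS *
| table VLN_ID VLN_CATEGORY
```

In a real search, the first three lines would be replaced by the base search on the index and sourcetype.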

regards,
Altin

altink
Builder

Thank you very much for the advice.

Initially I tried macros. I created a new one and tried to call it in a search,
as advised in:
https://docs.splunk.com/Documentation/SplunkCloud/6.5.1612/Knowledge/Usesearchmacros

but the information there does not seem to be enough. I tried the first example given,
quoted here:
" If you have a search macro named mymacro it looks like this when referenced in a search:

 sourcetype=access_* | `mymacro`"

with no result.
The error is:

Error in 'SearchParser': Missing a search command before '''. Error at position '29' of search query 'search sourcetype=access_* | 'MY_MACRO''. 

Am I missing something here? I am sure I am, but please tell me what. My whole search is inside the macro, so what am I supposed to add before it?

niketn
Legend

@Altin, I had suggested KV_MODE=xml. This performs search-time, not index-time, field extraction based on the sourcetype. You need to define it for your sourcetype. (PS: As a best practice, you should include the sourcetype in your base search as well.)

Search for the following in the documentation http://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf

KV_MODE = [none|auto|auto_escaped|multi|json|xml]

By default, KV_MODE is auto, which extracts key-value pairs separated by an = sign.

Your Splunk admin should be able to easily change KV_MODE to xml.

Another option, in case you don't want to rewrite the above spath query every time, would be to save it as a macro from Settings > Advanced search > Search Macro. That way you can call it like a function anywhere in your search/report/dashboard/alert.

Let me convert this to an answer so that you can test it and accept it once it has helped you resolve your issue.

altink
Builder

I tried:

index="index_test_01"
| spath output=VLN_ID path=CONTROL.VLN_ID 
| spath output=VLN_NAME path=CONTROL.VLN_NAME 
| spath output=VLN_SEVERITY path=CONTROL.VLN_SEVERITY 
| spath output=VLN_CATEGORY path=CONTROL.VLN_CATEGORY 
| spath output=VLN_SCAN_CODE path=CONTROL.VLN_SCAN_CODE 
| spath output=VLN_SCAN_MESSAGE path=CONTROL.VLN_SCAN_MESSAGE 
| spath output=VLN_CTRL_FIND path=CONTROL.VLN_CTRL_FIND 
| spath output=VLN_CTRL_SUMMARY path=CONTROL.VLN_CTRL_SUMMARY
| spath output=VLN_CTRL_OUTPUT path=CONTROL.VLN_CTRL_OUTPUT

| table VLN_ID VLN_NAME VLN_SEVERITY VLN_CATEGORY VLN_SCAN_CODE VLN_SCAN_MESSAGE VLN_CTRL_FIND VLN_CTRL_SUMMARY VLN_CTRL_OUTPUT

and got the fields as I needed. However, it looks unwieldy to do this every time in every search.
I also don't know how practical it will be (or how) to put search conditions on these fields, or to refer to the search
as a whole to build dashboard panels and reports.

It looks like the best option would be to extract the fields at index time.

However, the Splunk documentation says:
NOTE:
We do not recommend adding to the set of fields that are extracted
at index time unless it is absolutely necessary because there are
negative performance implications.

So it looks like it could be done in the app's props.conf, but I cannot find out how.

Can you help?

regards
Altin

altink
Builder

Sorry that I didn't reply; I am not yet used to the forum.
Yes, the event is all XML.
