Error when extracting multivalue field from xml data via rex command

jurbain
New Member

Hi

I need to extract a multivalue field from an event structured as XML.

<job>

<nameJob>Job1</nameJob>

<executionJob>



2016-10-03
12:31:25
ServerX
JobUser




Step 1

Clean directories A
2016/10/03 12:31:25

0 file(s) AND 0 folder(s)


rc=0

2016/10/03 12:31:26



DirClean

Clean Directories B
2016/10/03 12:31:26


==========================================



10 file(s) AND 0 folder(s)


rc=0

2016/10/03 12:31:27



2016-10-03

12:31:27
grc=0


</executionJob>
</job>

I cannot use xpath or KV_MODE=xml because some events contain special characters that prevent parsing.

I am trying to use a regular expression with the rex command, for example to extract the steps data.

The regular expression "<steps>(?<abc>((?!<\/steps>)[\s\S])*)<\/steps>" works well in a regular expression tester (PCRE), but when I try the same in Splunk with the following command:

"basic search | rex field=_raw "<steps>(?<abc>((?!</steps>)[\s\S])*)<\/steps>" max_match=999"

I get the error message:

"Streamed search execute failed because: Error in 'rex' command: Regex match error, please check log"

Do you have any idea what is going wrong?

Thanks.
J.


richgalloway
SplunkTrust

The regex string is missing an escape character. Try

\<steps>(?<abc>((?!<\/steps>)[\s\S])*)<\/steps>

That said, unless you need the entire XML indexed, consider using a scripted input to parse the XML and extract only the needed fields for indexing. A Python parser will be much easier to write and will save license and storage costs by reducing the XML verbosity.
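To illustrate the verbosity point, here is a minimal Python sketch that renders a few extracted fields as a single key="value" line and compares its length with the equivalent XML. The fragment, the `fields` dict, and the `to_kv_line` helper are hypothetical names chosen for illustration; the tag names follow the sample event in this thread.

```python
# Hypothetical fragment; tag names follow the sample event in this thread.
xml_step = (
    "<steps><nameStep>Step 1</nameStep>"
    "<rcStep>rc=0</rcStep>"
    "<endTimeStep>2016/10/03 12:31:26</endTimeStep></steps>"
)

# The same information after parsing, as a parser might hold it.
fields = {
    "nameStep": "Step 1",
    "rcStep": "rc=0",
    "endTimeStep": "2016/10/03 12:31:26",
}


def to_kv_line(fields):
    """Render a dict as one key="value" line (illustrative helper)."""
    # Replace embedded double quotes so each value stays inside its own quotes.
    return " ".join(
        '{}="{}"'.format(key, value.replace('"', "'"))
        for key, value in fields.items()
    )


kv_line = to_kv_line(fields)
print(kv_line)
print(len(xml_step), len(kv_line))
```

Whether the saving matters depends on event volume, so treat the comparison as indicative only.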


jurbain
New Member

Hi

I still get the error in the rex command (it does not like [\s\S]).
I will probably follow your recommendation and implement a Python parser to extract the information.

Thanks
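One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the `((?!</steps>)[\s\S])*` construct performs a lookahead and a capture for every character and may run into a regex engine limit on large events. The minimal Python sketch below compares it with a lazy dot-all pattern on a hypothetical two-step sample. Only the `<steps>` tag and the `abc` group name come from the question; the sample text is made up.

```python
import re

# Hypothetical sample with two <steps> blocks spanning several lines.
raw = (
    "<executionJob>"
    "<steps>\n  first step body\n</steps>"
    "<steps>\n  second step body\n</steps>"
    "</executionJob>"
)

# Pattern from the question: one lookahead and one capture per character.
tempered = re.compile(r"<steps>(?P<abc>((?!</steps>)[\s\S])*)</steps>")

# Lazy alternative: (?s) lets "." match newlines, ".*?" stops at the first </steps>.
lazy = re.compile(r"(?s)<steps>(?P<abc>.*?)</steps>")

tempered_hits = [m.group("abc") for m in tempered.finditer(raw)]
lazy_hits = [m.group("abc") for m in lazy.finditer(raw)]

# Both patterns should capture the same two step bodies.
print(tempered_hits == lazy_hits, len(lazy_hits))
```

The lazy form may be worth trying in rex as well, although it has not been tested against the events described here.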


lukejadamec
Super Champion

Can you list the values you are trying to extract for 'steps' from the event you posted?


jurbain
New Member

Hi Luke

The XML was not rendered correctly in my question. The XML structure of each event is:

    <job> 
        <nameJob>Job1</nameJob> 
        <executionJob> 
            <started> 
                <dateS>2016-10-03</dateS> 
                <timeS>12:31:25</timeS> 
                <serverS>ServerX</serverS> 
                <userS>JobUser</userS> 
            </started> 
            <steps> 
                <nameStep>Step 1</nameStep> 
                <descrStep>Clean directories A </descrStep> 
                <beginTimeStep>2016/10/03 12:31:25</beginTimeStep> 
                <comStep> 
                      <comment>Starting</comment> 
                 </comStep> 
                 <comStep> 
                      <comment>Execution....</comment> 
                 </comStep> 
                  <comStep> 
                     <comment>0 file(s) AND 0 folder(s)</comment> 
                </comStep> 
                <rcStep>rc=0</rcStep> 
                <endTimeStep>2016/10/03 12:31:26</endTimeStep> 
            </steps> 
            <steps> 
                <nameStep>Step 2</nameStep> 
                <descrStep>Clean Directories B</descrStep> 
                <beginTimeStep>2016/10/03 12:31:26</beginTimeStep> 
                 <comStep> 
                      <comment>Starting</comment> 
                 </comStep> 
                 <comStep> 
                       <comment>Execution....</comment> 
                </comStep> 
                <comStep> 
                    <comment>10 file(s) AND 0 folder(s)</comment> 
                </comStep> 
                <rcStep>rc=0</rcStep> 
                <endTimeStep>2016/10/03 12:31:27</endTimeStep> 
            </steps> 
            <ended> 
                <dateE>2016-10-03</dateE> 
                <timeE>12:31:27</timeE> 
                <globalRcE>grc=0</globalRcE> 
            </ended> 
        </executionJob> 
    </job>

So for one job event, I have several steps, each with a set of properties. Some of these properties are themselves multivalued, like comment, which provides the output of each step.

My final objective is to get something like

nameJob  nameStep  descrStep            beginTimeStep        comment
Job1     Step 1    Clean directories A  2016/10/03 12:31:25  Starting
                                                             Execution...
                                                             0 file(s) AND 0 folder(s)
Job1     Step 2    Clean Directories B  2016/10/03 12:31:26  Starting
                                                             Execution...
                                                             10 file(s) AND 0 folder(s)
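
A minimal Python sketch of a parser that yields one row per step, with comment kept as a list, could look like the following. It uses regular expressions instead of an XML parser on the assumption that special characters in some events would break strict XML parsing. The helper names (`tag_values`, `first`, `parse_job`) and the shortened `SAMPLE` event are hypothetical; the tag names come from the structure above.

```python
import re

# Shortened, hypothetical version of the event structure shown above.
SAMPLE = """<job>
    <nameJob>Job1</nameJob>
    <executionJob>
        <steps>
            <nameStep>Step 1</nameStep>
            <descrStep>Clean directories A </descrStep>
            <beginTimeStep>2016/10/03 12:31:25</beginTimeStep>
            <comStep>
                <comment>Starting</comment>
            </comStep>
            <comStep>
                <comment>0 file(s) AND 0 folder(s)</comment>
            </comStep>
        </steps>
        <steps>
            <nameStep>Step 2</nameStep>
            <descrStep>Clean Directories B</descrStep>
            <beginTimeStep>2016/10/03 12:31:26</beginTimeStep>
            <comStep>
                <comment>Starting</comment>
            </comStep>
            <comStep>
                <comment>10 file(s) AND 0 folder(s)</comment>
            </comStep>
        </steps>
    </executionJob>
</job>"""


def tag_values(tag, text):
    """Return the stripped text of every <tag>...</tag> pair found in text."""
    pattern = re.compile(r"(?s)<{0}>(.*?)</{0}>".format(re.escape(tag)))
    return [value.strip() for value in pattern.findall(text)]


def first(tag, text):
    """Return the first value of tag, or an empty string if it is absent."""
    values = tag_values(tag, text)
    return values[0] if values else ""


def parse_job(raw):
    """Build one row per <steps> block; 'comment' stays a list (multivalue)."""
    job_name = first("nameJob", raw)
    rows = []
    for step in tag_values("steps", raw):
        rows.append({
            "nameJob": job_name,
            "nameStep": first("nameStep", step),
            "descrStep": first("descrStep", step),
            "beginTimeStep": first("beginTimeStep", step),
            "comment": tag_values("comment", step),
        })
    return rows


rows = parse_job(SAMPLE)
for row in rows:
    print(row)
```

This sketch assumes the tags carry no attributes and that `<steps>` blocks are never nested; either case would need a more careful pattern.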