Error when extracting multivalue field from xml data via rex command

jurbain
New Member

Hi

I need to extract a multivalue field from an event structured as XML.

<job>

<nameJob>Job1</nameJob>

<executionJob>



2016-10-03
12:31:25
ServerX
JobUser




Step 1

Clean directories A
2016/10/03 12:31:25

0 file(s) AND 0 folder(s)


rc=0

2016/10/03 12:31:26



DirClean

Clean Directories B
2016/10/03 12:31:26


==========================================



10 file(s) AND 0 folder(s)


rc=0

2016/10/03 12:31:27



2016-10-03

12:31:27
grc=0


</executionJob>
</job>

I cannot use xpath or KV_MODE=xml because some events contain special characters that prevent parsing.

Instead, I am trying to use a regular expression with the rex command, for example to extract the steps data.

The regular expression "<steps>(?<abc>((?!<\/steps>)[\s\S])*)<\/steps>" works well in a PCRE regular expression tester, but when I try the same in Splunk with the following command:

"basic search | rex field=_raw "<steps>(?<abc>((?!</steps>)[\s\S])*)<\/steps>" max_match=999"

I get the error message:

"Streamed search execute failed because: Error in 'rex' command: Regex match error, please check log"

Do you have any idea what is going wrong?

Thanks.
J.


richgalloway
SplunkTrust

The regex string is missing an escape character. Try

\<steps>(?<abc>((?!<\/steps>)[\s\S])*)<\/steps>

That said, unless you need the entire XML indexed, you should consider using a scripted input to parse the XML and extract only the needed fields for indexing. A Python parser will be much easier to write and will save license and storage costs by reducing the XML verbosity.


jurbain
New Member

Hi

I still get the error from the rex command (it does not like [\s\S]).
I will probably follow your recommendation and implement a Python parser to extract the information.
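
One possible explanation, not confirmed anywhere in this thread, is that the `((?!</steps>)[\s\S])*` construct runs a lookahead and a capture for every single character, which can get expensive on large events. A lazy quantifier in dot-all mode expresses the same match with less work. The sketch below uses Python's `re` module only to show that both patterns capture identical text. The `event` string is a made-up, trimmed sample, and Python spells the named group `(?P<abc>...)` where PCRE also accepts `(?<abc>...)`.

```python
import re

# Hypothetical sample, trimmed from the event structure discussed in this thread.
event = """<job>
<steps>
<nameStep>Step 1</nameStep>
<rcStep>rc=0</rcStep>
</steps>
<steps>
<nameStep>Step 2</nameStep>
<rcStep>rc=0</rcStep>
</steps>
</job>"""

# Pattern from the question: the lookahead is evaluated before every character.
tempered = re.compile(r"<steps>(?P<abc>((?!</steps>)[\s\S])*)</steps>")

# Alternative: (?s) lets "." match newlines, and ".*?" stops at the first </steps>.
lazy = re.compile(r"(?s)<steps>(?P<abc>.*?)</steps>")

tempered_blocks = [m.group("abc") for m in tempered.finditer(event)]
lazy_blocks = [m.group("abc") for m in lazy.finditer(event)]

print(len(lazy_blocks))                # number of <steps> blocks found
print(tempered_blocks == lazy_blocks)  # both patterns capture the same text
```

If the same idea carries over to Splunk, the search might look like `... | rex field=_raw max_match=0 "(?s)<steps>(?<abc>.*?)</steps>"`. That form is untested here, and whether it avoids the error is an assumption.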

Thanks


lukejadamec
Super Champion

Can you list the values you are trying to extract for 'steps' from the event you posted?


jurbain
New Member

Hi Luke

The XML was not rendered correctly in my question. The XML structure of each event is:

    <job> 
        <nameJob>Job1</nameJob> 
        <executionJob> 
            <started> 
                <dateS>2016-10-03</dateS> 
                <timeS>12:31:25</timeS> 
                <serverS>ServerX</serverS> 
                <userS>JobUser</userS> 
            </started> 
            <steps> 
                <nameStep>Step 1</nameStep> 
                <descrStep>Clean directories A </descrStep> 
                <beginTimeStep>2016/10/03 12:31:25</beginTimeStep> 
                <comStep>
                    <comment>Starting</comment>
                </comStep>
                <comStep>
                    <comment>Execution....</comment>
                </comStep>
                <comStep>
                    <comment>0 file(s) AND 0 folder(s)</comment>
                </comStep>
                <rcStep>rc=0</rcStep> 
                <endTimeStep>2016/10/03 12:31:26</endTimeStep> 
            </steps> 
            <steps> 
                <nameStep>Step 2</nameStep> 
                <descrStep>Clean Directories B</descrStep> 
                <beginTimeStep>2016/10/03 12:31:26</beginTimeStep> 
                <comStep>
                    <comment>Starting</comment>
                </comStep>
                <comStep>
                    <comment>Execution....</comment>
                </comStep>
                <comStep>
                    <comment>10 file(s) AND 0 folder(s)</comment>
                </comStep>
                <rcStep>rc=0</rcStep> 
                <endTimeStep>2016/10/03 12:31:27</endTimeStep> 
            </steps> 
            <ended> 
                <dateE>2016-10-03</dateE> 
                <timeE>12:31:27</timeE> 
                <globalRcE>grc=0</globalRcE> 
            </ended> 
        </executionJob> 
    </job>

So for one job event, I have several steps, each with a set of properties. Some of these properties are themselves multivalued, such as comment, which holds the output of each step.

My final objective is to get something like:

nameJob  nameStep  descrStep            beginTimeStep        comment
Job1     Step 1    Clean directories A  2016/10/03 12:31:25  Starting
                                                             Execution...
                                                             0 file(s) AND 0 folder(s)
Job1     Step 2    Clean Directories B  2016/10/03 12:31:26  Starting
                                                             Execution...
                                                             10 file(s) AND 0 folder(s)