Splunk Search

How to match the last occurrence of the regex?

mmdacutanan
Explorer

I have a Splunk query that searches for the string 'PS1234_IVR_DM' and, once it is found, performs a rex on the field called 'value'. My problem is that within a single log file (XML format), PS1234_IVR_DM can appear more than once, which means I can get more than one possible value for the field 'value'. In my query below, I use max_match=0, which captures all occurrences of the string. When I remove it, I only get the first occurrence of the string. I am only interested in counting the last occurrence.

index=abc sourcetype=xml_logs applicationName=IVR PS1234_IVR_DM | rex "id=\"PS1234_IVR_DM.*?value=\"(?.*?)\"" max_match=0| timechart span=1h dc(corID) by input usenull=f

Can anybody suggest a way to do this, please?

Thanks in advance!

0 Karma

deepashri_123
Motivator

Hey @mmdacutanan,

You can try something like this:
index=abc sourcetype=xml_logs applicationName=IVR PS1234_IVR_DM | rex "id=\"PS1234_IVR_DM.*?value=\"(?<input>.*?)\"" max_match=0 | eval test=mvindex(input,-1) | timechart span=1h dc(corID) by test usenull=f
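For anyone who wants to check the idea outside Splunk, here is a rough Python sketch (not SPL) of what capturing every match and then taking index -1 does. The event string is a made-up, shortened version of a log, and `findall` plus `[-1]` only stand in for `max_match=0` and `mvindex(input,-1)`; treat it as an illustration under those assumptions.

```python
import re

# Hypothetical, shortened event: two <dialog> elements share the same id.
event = (
    '<dialog id="PS1234_IVR_DM" index="25" status="ok" value="spanish">'
    '</dialog>'
    '<dialog id="PS1234_IVR_DM" index="27" status="nomatch" value="nomatch">'
    '</dialog>'
)

# Same pattern as in the rex, with the capture group named "input".
pattern = re.compile(r'id="PS1234_IVR_DM.*?value="(?P<input>.*?)"')

# findall stands in for max_match=0: it returns every capture, in order.
all_values = pattern.findall(event)

# Index -1 stands in for mvindex(input, -1): keep only the last capture.
last_value = all_values[-1] if all_values else None

print(all_values, last_value)
```

The point is that the multivalue field keeps the matches in event order, so the last element is the last occurrence.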

Let me know if this helps!!!

0 Karma

MuS
Legend

Hi mmdacutanan,

Sample events would be helpful, but basically you can use a negative lookahead regex (which is very expensive) to get the last occurrence of PS1234_IVR_DM, like this:

(\bPS1234_IVR_DM\b)(?!.+\b(?<input>\1)\b)

The above is the regex you can use.
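As a rough illustration of the lookahead idea, here is a small Python sketch. The pattern in it is an assumed adaptation, not the exact regex above: it anchors on `id="PS1234_IVR_DM"`, rejects any position that is followed by another occurrence, and then captures the `value` attribute of the same tag. The event text is a made-up, shortened sample.

```python
import re

# Hypothetical, shortened event with two dialogs that share the same id.
event = (
    '<dialog id="PS1234_IVR_DM" index="25" value="spanish">\n'
    '</dialog>\n'
    '<dialog id="PS1234_IVR_DM" index="27" value="nomatch">\n'
    '</dialog>\n'
)

# Assumed adaptation of the negative lookahead approach.
last_dialog = re.compile(
    r'id="PS1234_IVR_DM"'               # the dialog id
    r'(?!.*id="PS1234_IVR_DM")'         # reject if the id occurs again later
    r'[^>]*?value="(?P<input>[^"]*)"',  # value attribute inside the same tag
    re.DOTALL,                          # let .* cross line breaks in the event
)

match = last_dialog.search(event)
last_value = match.group("input") if match else None

print(last_value)
```

The lookahead has to scan the rest of the event at every candidate position, which is why this approach gets expensive on large events.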

Hope this helps ...

cheers, MuS

0 Karma

mmdacutanan
Explorer

Here is a snippet of the XML log file. You will see there are 2 lines (one near the top, the other near the bottom) that contain PS1234_IVR_DM. The first line has 'value="spanish"', and the second occurrence of PS1234_IVR_DM has 'value="nomatch"'. I only want to count the value of the last one, which is "nomatch".

<dialog duration="13.868" error="0" id="PS1234_IVR_DM" index="25" language="en-US" noinput="0" nomatch="0" speaker="0" startTime="5.944" status="ok" value="spanish">
  <DialogTurns>
    <turn confidence="0.430000" duration="13.868" inputmode="voice" startTime="5.944" turnindex="1" value="spanish">
      <details>({interpretation:{GLOBAL:"spanish", delta:"-400"}, interpretation$:[1], utterance:"espanol", inputmode:"voice", confidence:0.4300000071525574, slotconf:{GLOBAL:0.4300000071525574, delta:0.4300000071525574}, marktime:undefined, markname:undefined})</details>
      <prompts>
        <prompt name="http://10.123.456.789:8080//Postpaid_AudioData/VS_APPL/en-US/0/PS1234_I_02.wav" type="audio"/>
      </prompts>
      <grammars>
        <grammar name="http://10.123.456.789:8080/Disambig_Instructions/vxml/grammars/en-US/PS1234_DM.grxml"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/dtmf_global.jsp?keys=7,repeat"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/en-US/universals.grxml#repeat"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/en-US/universals.grxml#spanish"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/choice_digits.jsp?keys=1&amp;items=selfserve"/>
      </grammars>
    </turn>
    <turn confidence="0.420000" duration="0.000" inputmode="voice" startTime="19.812" turnindex="2" type="confirm" value="yes">
      <details>({confidence:0.41999998688697815, utterance:"yeah", inputmode:"voice", interpretation:{CHOICE:"yes"}, interpretation$:[1], recording:undefined, recordingduration:undefined, recordingsize:undefined, marktime:undefined, markname:undefined})</details>
      <prompts>
        <prompt name="http://10.123.456.789:8080//Postpaid_AudioData/VS_APPL/en-US/0/PS4567.wav" type="audio"/>
        <prompt name="http://10.123.456.789:8080//Postpaid_AudioData/VS_APPL/en-US/0/010.wav" type="audio"/>
      </prompts>
      <grammars>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/en-US/confirm.grxml"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/confirmDtmf.grxml"/>
      </grammars>
    </turn>
  </DialogTurns>
</dialog>
<prompt id="PS1234_ExitPrompts_PP" index="26" language="es-US" speaker="0" startTime="2018-08-06 16:59:09.623">
  <prompts>
    <prompt name="PS1234_E_05" type="audio"/>
  </prompts>
</prompt>
<dialog duration="5.148" error="0" id="PS1234_IVR_DM" index="27" language="es-US" noinput="0" nomatch="1" speaker="0" startTime="19.828" status="nomatch" value="nomatch">
  <DialogTurns>
    <turn duration="5.148" startTime="19.828" turnindex="1" value="nomatch">
      <prompts>
        <prompt name="http://10.123.456.789:8080//Postpaid_AudioData/VS_APPL/es-US/0/PS1234_I_02.wav" type="audio"/>
        <prompt name="http://10.123.456.789:8080//Postpaid_AudioData/VS_APPL/es-US/0/0.wav" type="audio"/>
      </prompts>
      <grammars>
        <grammar name="http://10.123.456.789:8080/Disambig_Instructions/vxml/grammars/es-US/PS1234_DM.grxml"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/dtmf_global.jsp?keys=7,repeat"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/es-US/universals.grxml#repeat"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/es-US/universals.grxml#english"/>
        <grammar name="http://10.123.456.789:8080/Postpaid_CommonData/vxml/grammars/choice_digits.jsp?keys=1&amp;items=selfserve"/>
      </grammars>
    </turn>
  </DialogTurns>
</dialog>
0 Karma

MuS
Legend

Okay, that's a different requirement now ... is it:

  1. you want the last occurrence of a string?
  2. you want the last occurrence of a string followed by value="nomatch"?
0 Karma

mmdacutanan
Explorer

Hello again MuS,

The string PS1234_IVR_DM will always be followed by a field called 'value' (which I am renaming as 'input' in my rex expression). The actual value of the 'value' field doesn't matter. What matters is that I am only counting by the value of the field 'value' where the string PS1234_IVR_DM is last seen. Hope that helps?

0 Karma

FrankVl
Ultra Champion

Not sure about the structure of your data, but shouldn't each ... section be a separate event in Splunk? Or do these multiple occurrences really belong together in 1 event for some reason?

0 Karma

mmdacutanan
Explorer

Hi Frank,

Exactly, those multiple occurrences belong to one event. In my case, 1 event is actually one xml log file. And that's why per event, I only want to count the last occurrence of the string and then move on to next event.
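To make the requirement concrete, here is a minimal Python sketch of the counting I am after: per event, keep only the last value, then count distinct corID per value. The events, the corID values, and the helper name `last_input` are all made up for illustration, and the pattern is a simplified variant of my rex.

```python
import re
from collections import defaultdict

# Simplified variant of the rex pattern; stays inside one tag via [^>].
PATTERN = re.compile(r'id="PS1234_IVR_DM[^>]*?value="([^"]*)"')


def last_input(raw_event):
    """Return the value of the last PS1234_IVR_DM dialog, or None if absent."""
    values = PATTERN.findall(raw_event)
    return values[-1] if values else None


# Hypothetical events as (corID, raw XML text); real events are whole log files.
events = [
    ("c1", '<dialog id="PS1234_IVR_DM" value="spanish"></dialog>'
           '<dialog id="PS1234_IVR_DM" value="nomatch"></dialog>'),
    ("c2", '<dialog id="PS1234_IVR_DM" value="spanish"></dialog>'),
    ("c3", '<dialog id="OTHER_DM" value="yes"></dialog>'),
]

cor_ids_by_input = defaultdict(set)
for cor_id, raw in events:
    value = last_input(raw)
    if value is not None:  # skip events without a match, roughly like usenull=f
        cor_ids_by_input[value].add(cor_id)

# Distinct corID count per last value, similar in spirit to dc(corID) by input.
counts = {value: len(ids) for value, ids in cor_ids_by_input.items()}
print(counts)
```

Event c1 should count only under "nomatch", even though "spanish" appears earlier in the same event.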

Thank you!

0 Karma

mmdacutanan
Explorer

Apologies, my Splunk query looked different after I copied and pasted it initially, so I am reposting it below:

index=abc sourcetype=xml_logs applicationName=IVR PS1234_IVR_DM | rex "id=\"PS1234_IVR_DM.*?value=\"(?<input>.*?)\"" max_match=0| timechart span=1h dc(corID) by input usenull=f
0 Karma