XML Field Extraction

Explorer

Hi,

Here's a sample of my XML data. I want to get the username. I tried a field alias, but that's not working, nor is field extraction. When I open the field extractor tool, the data is truncated after the caller_profile tag. When I look at the event, it's all there. It's only when I try to use the field extractor that it gets truncated.

props.conf:
[conf_cdr_xml]
TRUNCATE = 0
KV_MODE = xml

data sample:


1235551234-101
hostname.com
8000
20
1510329526
1510329534


1510329526
1510329534

true
true
false
false


1235551010
XML
Joe Boss
1235551010


1235551010

10.0.1.1

1235551234;conf=101;mod;tone=NO_SOUNDS
038fa0ce-c630-11e7-938f-b3cdceb36fa4
mod_sofia
public
sofia/internal/1235551010@10.10.1.1





SplunkTrust

@mwcooley, when you say KV_MODE=xml is not working, do you mean that search-time field discovery in Smart/Verbose mode is not working? Does the following table command fail to return the field?

<YourBaseSearch>
|  table *username

Have you also tried

<YourBaseSearch>
| spath
|  table *username
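
As a run-anywhere check of this approach, here is a minimal sketch that builds a single event with makeresults and runs spath on it. The inline XML is a trimmed, hypothetical stand-in for the real event, not data from an index. With XML like this, spath would normally name the field after its full element path, which is why the wildcard in table *username is useful.

```spl
| makeresults
| eval _raw="<cdr><conference><members><member><caller_profile><username>1235551010</username></caller_profile></member></members></conference></cdr>"
| spath
| table *username
```

If this returns the username but the same commands on the indexed events do not, the problem is more likely in the event data itself (for example, truncation) than in spath.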

In case XML parsing is not working and you are able to see data with <username>1235551010</username>, then try the following rex command and see how it behaves:

<YourBaseSearch>
|  rex "<username>(?<username>[^\<]+)</username>"
|  table username
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

Champion

So your events are already broken correctly and you're just working on field extractions? If so, the KV_MODE setting needs to be in props.conf on your search head, because it is a search-time setting. Is it there?

Explorer

Ah, OK. I don't have access to the search head, only the forwarder. I thought I could put it in props.conf there and make it work.

Explorer

@niketnilay, that's closer. | spath | table *username works. I get the usernames even when there are multiples. The rex command only returns the first one.

If I use spath, how do I get the username into eventstats?

New Member

With my XML, the solution above returns only a single result.

SplunkTrust

@mintucs, you might have to post a separate question with your sample XML data and the extraction you are using and, if applicable, your props.conf and transforms.conf as well. Be sure to mask any sensitive information when posting your question.

SplunkTrust

@mwcooley, your sample data had only one username in the event. When you say the rex command returns only the first match, do you mean that a single event may contain multiple usernames? Can you add such a sample?

In any case, you can use max_match=0 in the rex command to return multiple matches within a single event. The username field will then be treated as multivalued.

<YourBaseSearch>
|  rex "<username>(?<username>[^<]+)<\/username>" max_match=0
|  table username
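
To see the effect of max_match=0 without touching indexed data, a run-anywhere sketch along these lines may help. The _raw value is a shortened, hypothetical event with two members, and user_count is an illustrative field name that is not part of the original search.

```spl
| makeresults
| eval _raw="<members><member><caller_profile><username>1235551010</username></caller_profile></member><member><caller_profile><username>1235557721</username></caller_profile></member></members>"
| rex "<username>(?<username>[^<]+)<\/username>" max_match=0
| eval user_count=mvcount(username)
| table username user_count
```

With max_match=0 the username field should hold both values and mvcount should report 2. Removing max_match=0 leaves rex at its default of a single match per event.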

What do you mean by eventstats? What is your intended output, and which fields do you want to use? In other words, give the desired field names and expected values in tabular format.

Explorer

I want to count the users, so I was trying to feed the usernames to eventstats. The final search is in a comment below.

Explorer

Thanks @niketnilay, using max_match=0 worked. Here's the final search (it turned out I needed callerID, not username):

index="myIndex" sourcetype="conf_cdr_xml"
| eval Conf_Start=strftime(start_time,"%H:%M:%S %m/%d/%y")
| eval Conf_End=strftime(end_time,"%H:%M:%S %m/%d/%y")
| eval Duration = tostring((end_time - start_time), "Duration")
| rex "<caller_id_name>(?<caller_id_name>[^<]+)<\/caller_id_name>" max_match=0
| eventstats count(caller_id_name) as Attendees by Conf_Start
| table confName Conf_Start Conf_End Duration Attendees

And here's the XML with multiple usernames/callerIDs:

<?xml version="1.0"?>
<cdr>
  <conference>
    <name>1235551234-101</name>
    <hostname>hostname.com</hostname>
    <rate>8000</rate>
    <interval>20</interval>
    <start_time type="UNIX-epoch">1510329526</start_time>
    <end_time endconference_forced="false" type="UNIX-epoch">1510329534</end_time>
    <members>
      <member type="caller">
        <join_time type="UNIX-epoch">1510329526</join_time>
        <leave_time type="UNIX-epoch">1510329534</leave_time>
        <flags>
          <is_moderator>true</is_moderator>
          <end_conference>true</end_conference>
          <was_kicked>false</was_kicked>
          <is_ghost>false</is_ghost>
        </flags>
        <caller_profile>
          <username>1235551010</username>
          <dialplan>XML</dialplan>
          <caller_id_name>Joe Boss</caller_id_name>
          <caller_id_number>1235551010</caller_id_number>
          <callee_id_name></callee_id_name>
          <callee_id_number></callee_id_number>
          <ani>1235551010</ani>
          <aniii></aniii>
          <network_addr>10.0.1.1</network_addr>
          <rdnis></rdnis>
          <destination_number>1235551234;conf=101;mod;tone=NO_SOUNDS</destination_number>
          <uuid>038fa0ce-c630-11e7-938f-b3cdceb36fa4</uuid>
          <source>mod_sofia</source>
          <context>public</context>
          <chan_name>sofia/internal/1235551010@10.10.1.1</chan_name>
        </caller_profile>
      </member>
      <member type="caller">
        <join_time type="UNIX-epoch">1510329526</join_time>
        <leave_time type="UNIX-epoch">1510329534</leave_time>
        <flags>
          <is_moderator>true</is_moderator>
          <end_conference>true</end_conference>
          <was_kicked>false</was_kicked>
          <is_ghost>false</is_ghost>
        </flags>
        <caller_profile>
          <username>1235557721</username>
          <dialplan>XML</dialplan>
          <caller_id_name>Bob</caller_id_name>
          <caller_id_number>1235557721</caller_id_number>
          <callee_id_name></callee_id_name>
          <callee_id_number></callee_id_number>
          <ani>1235557721</ani>
          <aniii></aniii>
          <network_addr>10.0.1.2</network_addr>
          <rdnis></rdnis>
          <destination_number>1235551234;conf=101;mod;tone=NO_SOUNDS</destination_number>
          <uuid>038fa0ce-c630-11e7-938f-b3cdceb36fa4</uuid>
          <source>mod_sofia</source>
          <context>public</context>
          <chan_name>sofia/internal/1235557721@10.10.1.2</chan_name>
        </caller_profile>
      </member>
    </members>
    <rejected></rejected>
  </conference>
</cdr>

SplunkTrust

Glad it worked. Do compare stats and eventstats and see which one you actually need.
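
A small run-anywhere sketch of the difference, using three made-up events in two groups (the field names n, conf and Attendees are chosen here for illustration only):

```spl
| makeresults count=3
| streamstats count as n
| eval conf=if(n<=2, "A", "B")
| eventstats count as Attendees by conf
| table n conf Attendees
```

eventstats keeps all three rows and adds the per-group count to each of them. Replacing eventstats with stats in the same search collapses the result to one row per conf value, and fields not named in the stats command, such as n, are no longer available afterwards.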

Path Finder

Try this in the props.conf

[conf_cdr_xml]
KV_MODE = xml
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ((--NEVER--))
MAX_EVENTS = 1000
NO_BINARY_CHECK = true
pulldown_type = true

Explorer

Hey. Didn't work. I forward to heavy forwarders, which forward to the indexers. I'm worried something in the heavy forwarder is messing me up. KV_MODE isn't working either. But then, I'm a complete noob.

Explorer

Dang it. The preview showed my XML as text. One more try:

<cdr>
  <conference>
    <name>1235551234-101</name>
    <hostname>hostname.com</hostname>
    <rate>8000</rate>
    <interval>20</interval>
    <start_time type="UNIX-epoch">1510329526</start_time>
    <end_time endconference_forced="false" type="UNIX-epoch">1510329534</end_time>
    <members>
      <member type="caller">
        <join_time type="UNIX-epoch">1510329526</join_time>
        <leave_time type="UNIX-epoch">1510329534</leave_time>
        <flags>
          <is_moderator>true</is_moderator>
          <end_conference>true</end_conference>
          <was_kicked>false</was_kicked>
          <is_ghost>false</is_ghost>
        </flags>
        <caller_profile>
          <username>1235551010</username>
          <dialplan>XML</dialplan>
          <caller_id_name>Joe Boss</caller_id_name>
          <caller_id_number>1235551010</caller_id_number>
          <callee_id_name></callee_id_name>
          <callee_id_number></callee_id_number>
          <ani>1235551010</ani>
          <aniii></aniii>
          <network_addr>10.0.1.1</network_addr>
          <rdnis></rdnis>
          <destination_number>1235551234;conf=101;mod;tone=NO_SOUNDS</destination_number>
          <uuid>038fa0ce-c630-11e7-938f-b3cdceb36fa4</uuid>
          <source>mod_sofia</source>
          <context>public</context>
          <chan_name>sofia/internal/1235551010@10.10.1.1</chan_name>
        </caller_profile>
      </member>
    </members>
    <rejected></rejected>
  </conference>
</cdr>