<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473751#M133296</link>
    <description>&lt;P&gt;*Update: Based on the data you provided in another comment I tweaked the regex.&lt;/P&gt;

&lt;P&gt;I would avoid lookaheads and lookbehinds if possible, especially with a big payload. It's too easy to end up with a poorly performing or broken regex.&lt;/P&gt;

&lt;P&gt;You also don't need to use the FORMAT setting in transforms.conf if your regex uses named capture groups, so that the field names come with the extractions.&lt;/P&gt;

&lt;P&gt;You can extract the caller and called ID and version fields with two stanzas, one for caller and one for called.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[callerid]
REGEX = caller\"\s*:\s*\{\s*\"id\":\s*\"(?&amp;lt;callerid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;callerversion&amp;gt;[^\"]+)?\"

[calledid]
REGEX = called\":\s*\{\s*\"id\":\s*\"(?&amp;lt;calledid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;calledversion&amp;gt;[^\"]+)?\"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is the inline search I used to test both regexes:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval test="{\"info\": {\"eventSource\": \"RPA\", \"sourceType\": \"I\", \"status\": {\"code\": \"0000\", \"msg\": \"Inizio Schedulazione\", \"msgError\": \"\"}, \"transactionId\": \"66083\", \"traceId\": \"124021\", \"timestampStart\": \"2019-10-16T11:34:00.000Z\", \"timestampEnd\": \"null\", \"companyIDCode\": \"01\", \"channelIDCode\": \"\", \"branchCode\": \"\", \"searchFields\": [{\"VDI\": \"WPVRTM2004\"}, {\"PROCESSO\": \"Assegni\"}], \"annotation\": [{\"TIPO\": \"SCHEDULAZIONE\"}, {\"RISORSE POOL\": \"SI\"}], \"caller\": {\"id\": \"VWFM\", \"version\": \"1\", \"acronym\": \"WRPA0\"}, \"called\": {\"id\": \"Assegni\", \"version\": \"1\", \"acronym\": \"WRPA0\"}}, \"payLoad\": {\"output\": {\"encoding\": \"\", \"ccsid\": \"\", \"data\": \"\"}, \"input\": {\"encoding\": \"\", \"ccsid\": \"\", \"data\": \"\"}}}"
| rex field=test "called\":\s*\{\s*\"id\":\s*\"(?&amp;lt;calledid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;calledversion&amp;gt;[^\"]+)?\""
| rex field=test "caller\"\s*:\s*\{\s*\"id\":\s*\"(?&amp;lt;callerid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;callerversion&amp;gt;[^\"]+)?\""
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Fri, 01 Nov 2019 13:13:51 GMT</pubDate>
    <dc:creator>wenthold</dc:creator>
    <dc:date>2019-11-01T13:13:51Z</dc:date>
    <item>
      <title>How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473750#M133295</link>
      <description>&lt;P&gt;We need to write a regex to extract some fields from our events in Splunk at index time. These fields will be used in searches with the &lt;CODE&gt;tstats&lt;/CODE&gt; command. The regex will go in the Splunk configuration file &lt;CODE&gt;transforms.conf&lt;/CODE&gt;.&lt;/P&gt;

&lt;P&gt;The key point is that the fields we want to extract at index time share the same JSON key but sit under different parent JSON keys.&lt;/P&gt;

&lt;P&gt;Is it possible to model this extraction with a regex?&lt;/P&gt;

&lt;P&gt;This is an example of a Splunk event with the structure described above (it is JSON):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{
       "info":{
          "eventSource":"",
          "sourceType":"I/O",
          "status":{
             "code":"",
             "msg":"",
             "msgError":""
          },
          "transactionId":null,
          "traceId":null,
          "timestampStart":"2019-05-16T21:30:55.174Z",
          "timestampEnd":"2019-05-16T21:30:55.174Z",
          "companyIDCode":"",
          "channelIDCode":"",
          "branchCode":"",
          "searchFields":{
             "key_3":"value",
             "key_2":"value",
             "key_1":"value"
          },
          "annotation":{},
          "caller":{
             "id":"",
             "version":"",
             "acronym":""
          },
          "called":{
             "id":"",
             "version":"",
             "acronym":""
          },
             "storage":{
                "id":"",
                "start":"",
                "end":""
             }
          }
       },
       "headers":[],
       "payLoad":{
          "input":{
             "encoding":"1024",
             "ccsid":"1024",
             "data":"dati_in"
          },
          "output":{
             "encoding":"1024",
             "ccsid":"1024",
             "data":"dati_out"
          }
       }
    }
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The expected result is something like this:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;calledid -&amp;gt; aaa&lt;/LI&gt;
&lt;LI&gt;callerversion -&amp;gt; 1&lt;/LI&gt;
&lt;LI&gt;callerid -&amp;gt; bbb&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;We tried something like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[calledid] 
REGEX =(?&amp;lt;=called).*"id":"(?P&amp;lt;calledid&amp;gt;.*?)(?=") 
FORMAT = calledid::"$1" 
WRITE_META =true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;but it doesn't work because it matches everything up to the last id it finds, such as:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;":{"id":"","version":"","acronym":""},"storage":{"id":"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2019 08:26:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473750#M133295</guid>
      <dc:creator>piefragnisp</dc:creator>
      <dc:date>2019-10-31T08:26:52Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473751#M133296</link>
      <description>&lt;P&gt;*Update: Based on the data you provided in another comment I tweaked the regex.&lt;/P&gt;

&lt;P&gt;I would avoid lookaheads and lookbehinds if possible, especially with a big payload. It's too easy to end up with a poorly performing or broken regex.&lt;/P&gt;

&lt;P&gt;You also don't need to use the FORMAT setting in transforms.conf if your regex uses named capture groups, so that the field names come with the extractions.&lt;/P&gt;

&lt;P&gt;You can extract the caller and called ID and version fields with two stanzas, one for caller and one for called.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[callerid]
REGEX = caller\"\s*:\s*\{\s*\"id\":\s*\"(?&amp;lt;callerid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;callerversion&amp;gt;[^\"]+)?\"

[calledid]
REGEX = called\":\s*\{\s*\"id\":\s*\"(?&amp;lt;calledid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;calledversion&amp;gt;[^\"]+)?\"
&lt;/CODE&gt;&lt;/PRE&gt;
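
&lt;P&gt;A note for the index-time case, as a sketch under assumptions rather than a tested configuration: the transforms.conf spec describes named capture groups as creating fields directly for search-time extractions, whereas at index time FORMAT appears to default to &lt;CODE&gt;&amp;lt;stanza-name&amp;gt;::$1&lt;/CODE&gt; and WRITE_META is needed to write the result as indexed fields. An indexed variant of the caller stanza might therefore look like this, with the numbered groups $1 and $2 standing in for the named groups (the called stanza would follow the same pattern):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[callerid]
REGEX = caller\"\s*:\s*\{\s*\"id\":\s*\"([^\"]+)?\"\,\s*\"version\":\s*\"([^\"]+)?\"
FORMAT = callerid::$1 callerversion::$2
WRITE_META = true
&lt;/CODE&gt;&lt;/PRE&gt;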

&lt;P&gt;This is the inline search I used to test both regexes:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval test="{\"info\": {\"eventSource\": \"RPA\", \"sourceType\": \"I\", \"status\": {\"code\": \"0000\", \"msg\": \"Inizio Schedulazione\", \"msgError\": \"\"}, \"transactionId\": \"66083\", \"traceId\": \"124021\", \"timestampStart\": \"2019-10-16T11:34:00.000Z\", \"timestampEnd\": \"null\", \"companyIDCode\": \"01\", \"channelIDCode\": \"\", \"branchCode\": \"\", \"searchFields\": [{\"VDI\": \"WPVRTM2004\"}, {\"PROCESSO\": \"Assegni\"}], \"annotation\": [{\"TIPO\": \"SCHEDULAZIONE\"}, {\"RISORSE POOL\": \"SI\"}], \"caller\": {\"id\": \"VWFM\", \"version\": \"1\", \"acronym\": \"WRPA0\"}, \"called\": {\"id\": \"Assegni\", \"version\": \"1\", \"acronym\": \"WRPA0\"}}, \"payLoad\": {\"output\": {\"encoding\": \"\", \"ccsid\": \"\", \"data\": \"\"}, \"input\": {\"encoding\": \"\", \"ccsid\": \"\", \"data\": \"\"}}}"
| rex field=test "called\":\s*\{\s*\"id\":\s*\"(?&amp;lt;calledid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;calledversion&amp;gt;[^\"]+)?\""
| rex field=test "caller\"\s*:\s*\{\s*\"id\":\s*\"(?&amp;lt;callerid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;callerversion&amp;gt;[^\"]+)?\""
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 01 Nov 2019 13:13:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473751#M133296</guid>
      <dc:creator>wenthold</dc:creator>
      <dc:date>2019-11-01T13:13:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473752#M133297</link>
      <description>&lt;P&gt;@wenthold this is how I've configured it based on your tweaks:&lt;/P&gt;

&lt;P&gt;in &lt;CODE&gt;transforms.conf&lt;/CODE&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[caller]
REGEX = caller\"\s*:\s*\{\s*\"id\":\s*\"(?&amp;lt;callerid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;callerversion&amp;gt;[^\"]+)?\"
[called]
REGEX = called\":\s*\{\s*\"id\":\s*\"(?&amp;lt;calledid&amp;gt;[^\"]+)?\"\,\s*\"version\":\s*\"(?&amp;lt;calledversion&amp;gt;[^\"]+)?\"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;in &lt;CODE&gt;fields.conf&lt;/CODE&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[caller]
INDEXED = true
[called]
INDEXED = true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;in &lt;CODE&gt;props.conf&lt;/CODE&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[evento_srctyp]
TRANSFORMS-caller = caller
TRANSFORMS-called = called
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;then I tried to run a search like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| tstats values where index=nbp_index_application by callerid
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;but I got no results. Also, when I look at the extracted fields I can't find &lt;CODE&gt;callerid&lt;/CODE&gt;, &lt;CODE&gt;callerversion&lt;/CODE&gt;, &lt;CODE&gt;calledid&lt;/CODE&gt;, or &lt;CODE&gt;calledversion&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2019 07:50:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473752#M133297</guid>
      <dc:creator>piefragnisp</dc:creator>
      <dc:date>2019-11-06T07:50:44Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473753#M133298</link>
      <description>&lt;P&gt;If you run btool from the command line, do you see the configuration items listed?&lt;BR /&gt;
&lt;CODE&gt;/opt/splunk/bin/splunk cmd btool props list evento_srctyp --debug | less&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;Replace /opt/splunk with whatever your $SPLUNK_HOME path is. I don't know if "evento_srctyp" is the actual sourcetype or if you masked it, but "splunk cmd btool props list {SOURCETYPE} --debug" will dump out the configuration.&lt;/P&gt;

&lt;P&gt;I wouldn't worry about indexing the fields until you have the extractions working. When you can run the search &lt;CODE&gt;sourcetype=evento_srctyp | table _time host source callerid callerversion calledid calledversion&lt;/CODE&gt; and get results, review the &lt;A href="https://docs.splunk.com/Documentation/Splunk/8.0.0/Data/Configureindex-timefieldextraction" target="_blank"&gt;index-time field extraction documentation&lt;/A&gt; and then work on the indexed extraction.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 02:50:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473753#M133298</guid>
      <dc:creator>wenthold</dc:creator>
      <dc:date>2020-09-30T02:50:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473754#M133299</link>
      <description>&lt;P&gt;Do you mean from the deployment server CLI?&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2019 15:08:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473754#M133299</guid>
      <dc:creator>piefragnisp</dc:creator>
      <dc:date>2019-11-06T15:08:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473755#M133300</link>
      <description>&lt;P&gt;Use btool on a system where the configuration has been deployed, not on the deployment server.&lt;/P&gt;

&lt;P&gt;I would probably test the field extraction part on a search head first, then when the extraction syntax has been deployed I would remove it from the testing search head and deploy it to the indexers or heavy forwarders and work on the indexed extraction.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2019 15:18:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473755#M133300</guid>
      <dc:creator>wenthold</dc:creator>
      <dc:date>2019-11-06T15:18:50Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473756#M133301</link>
      <description>&lt;P&gt;@wenthold We fixed it by adding WRITE_META = true in transforms.conf. However, when running a search it extracts only two fields (using the event I gave you):&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;called (with the caller id):  "Assegni"&lt;/LI&gt;
&lt;LI&gt;caller (with the called id): "VWFM"&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;It seems to drop the &lt;CODE&gt;callerversion&lt;/CODE&gt; and &lt;CODE&gt;calledversion&lt;/CODE&gt; fields.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Nov 2019 15:11:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473756#M133301</guid>
      <dc:creator>piefragnisp</dc:creator>
      <dc:date>2019-11-07T15:11:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473757#M133302</link>
      <description>&lt;P&gt;If two of the fields are working, then my first guess would be that it's an issue with whitespace in the regular expression.&lt;/P&gt;

&lt;P&gt;Try running a search that will give you the raw json results and add the following:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| rex field=_raw "caller\"\s*:\s*\{\s*\"id\":\s*\"(?&amp;lt;test_callerid&amp;gt;[^\"]+)?\"\,\s*\"version\"\s*:\s*\"(?&amp;lt;test_callerversion&amp;gt;[^\"]+)?\""
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;See if you get the test_* fields extracted from the json.  If the only thing at this point that needs to be tweaked is the regular expression, you might want to try tweaking it on regex101: &lt;A href="https://regex101.com/r/4lGvKg/1"&gt;https://regex101.com/r/4lGvKg/1&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 08 Nov 2019 14:23:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473757#M133302</guid>
      <dc:creator>wenthold</dc:creator>
      <dc:date>2019-11-08T14:23:07Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract in Splunk at index time (with tstats) json field with same child-key from different father-key using regex?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473758#M133303</link>
      <description>&lt;P&gt;We solved it using these regexes:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[called_id]
REGEX = (?:info":{)(?:[\s\S]*?)(?:"called":{)(?:[\s\S]*?)(?:"id":)(?:(?:(?:")([^"]*)(?:"))|(null))(?:(?:[\}])|,)(?:(?:[^}]*)(?:}))
FORMAT = chiamato::"$1"
WRITE_META = true

[caller_id]
REGEX = (?:info":{)(?:[\s\S]*?)(?:"caller":{)(?:[\s\S]*?)(?:"id":)(?:(?:(?:")([^"]*)(?:"))|(null))(?:(?:[\}])|,)(?:(?:[^}]*)(?:}))
FORMAT = chiamante::"$1"
WRITE_META = true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;in transforms.conf, with these lines in props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TRANSFORMS-called_id = called_id
TRANSFORMS-caller_id = caller_id
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and enabling the fields in fields.conf.&lt;/P&gt;
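
&lt;P&gt;The fields.conf entries are not shown in this post. A minimal sketch of what they might look like, assuming the stanza names have to match the indexed field names written by the FORMAT lines (chiamato and chiamante):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[chiamato]
INDEXED = true

[chiamante]
INDEXED = true
&lt;/CODE&gt;&lt;/PRE&gt;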

&lt;P&gt;Hope this helps other people.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Dec 2019 09:14:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-in-Splunk-at-index-time-with-tstats-json-field/m-p/473758#M133303</guid>
      <dc:creator>piefragnisp</dc:creator>
      <dc:date>2019-12-04T09:14:16Z</dc:date>
    </item>
  </channel>
</rss>

