<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691367#M114997</link>
    <pubDate>Sun, 23 Jun 2024 15:12:06 GMT</pubDate>
    <dc:creator>vn_g</dc:creator>
    <dc:date>2024-06-23T15:12:06Z</dc:date>
    <item>
      <title>IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691353#M114990</link>
      <description>&lt;PRE&gt;Input Event&amp;nbsp;:&amp;nbsp;[so much data exists in the same single line ] ,"Comments": "New alert", &lt;STRONG&gt;&lt;EM&gt;"Data": "{\"etype\":\"MalwareFamily\",\"at\":\"2024-06-21T11:34:07.0000000Z\",\"md\":\"2024-06-21T11:34:07.0000000Z\",\"Investigations\":[{\"$id\":\"1\",\"Id\":\"urn:ZappedUrlInvestigation:2cc87ae3\",\"InvestigationStatus\":\"Running\"}],\"InvestigationIds\":[\"urn:ZappedUrlInvestigation:2cc8782d063\"],\"Intent\":\"Probing\",\"ResourceIdentifiers\":[{\"$id\":\"2\",\"AadTenantId\":\"2dfb29-729c918\",\"Type\":\"AAD\"}],\"AzureResourceId\":null,\"WorkspaceId\":null,\"Metadata\":{\"CustomApps\":null,\"GenericInfo\":null},\"Entities\":[{\"$id\":\"3\",\"MailboxPrimaryAddress\":\"abc@gmail.com\",\"Upn\":\"abc@gmail.com\",\"AadId\":\"6eac3b76357\",\"RiskLevel\":\"None\",\"Type\":\"mailbox\",\"Urn\":\"urn:UserEntity:10338af2b6c\",\"Source\":\"TP\",\"FirstSeen\":\"0001-01-01T00:00:00\"}, \"StartTimeUtc\": \"2024-06-21T10:12:37\", \"Status\": \"Investigation Started\"}","EntityType": "MalwareFamily", [so much data exists in the same single line ]&lt;/EM&gt;&lt;/STRONG&gt;&lt;/PRE&gt;&lt;P&gt;Each event is a single line that contains a lot of data.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;I want to substitute (\") with (") only inside the value of the Data dictionary, nothing before it and nothing after it. 
Sample regex:&lt;SPAN&gt;&amp;nbsp;&lt;A href="https://regex101.com/r/Gsfaay/1" target="_blank" rel="nofollow noopener noreferrer"&gt;https://regex101.com/r/Gsfaay/1&lt;SPAN&gt;&amp;nbsp;(only the highlighted data in group 4 should be modified)&lt;/SPAN&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;The dictionary value is enclosed in quotes (as a string). I want those quotes replaced by [ ] brackets so that the value becomes a list (groups 3 and 6).&lt;PRE&gt;Output Required : [so much data exists in the same single line ],"Comments": "New alert", "Data": [{"etype":"MalwareFamily", so on,"Status":"Investigation Started"}],"EntityType": "MalwareFamily", [so much data exists in the same single line ]&lt;/PRE&gt;&lt;P&gt;Trials:&lt;/P&gt;&lt;P&gt;[testing_logs]&lt;BR /&gt;SEDCMD-DataJson = s/\\\"/\"/g s/"Data": "{"/"Data": \[{"/g s/("Data": \[{".*})",/$1],/g&lt;BR /&gt;INDEXED_EXTRACTIONS = json&lt;BR /&gt;KV_MODE = json&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I tried it in multiple steps, as in the example above, but in Splunk SEDCMD works on the entire _raw value. 
I shouldn't apply it globally.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;1.&amp;nbsp;&lt;A href="https://regex101.com/r/0g2bcL/1" target="_blank" rel="nofollow noopener noreferrer"&gt;regex101.com/r/0g2bcL/1&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;2.&amp;nbsp;&lt;A href="https://regex101.com/r/o3eFgJ/1" target="_blank" rel="nofollow noopener noreferrer"&gt;regex101.com/r/o3eFgJ/1&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;3.&amp;nbsp;&lt;A href="https://regex101.com/r/D7Of0v/1" target="_blank" rel="nofollow noopener noreferrer"&gt;regex101.com/r/D7Of0v/1&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The only issue is with the first regex: it shouldn't be applied globally to the entire event value. It should apply only within the Data dictionary value.&lt;/SPAN&gt;&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Sat, 22 Jun 2024 16:38:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691353#M114990</guid>
      <dc:creator>vn_g</dc:creator>
      <dc:date>2024-06-22T16:38:55Z</dc:date>
    </item>
    <item>
      <title>Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691354#M114991</link>
      <description>&lt;P&gt;Try something like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex mode=sed "s/(Data\": )\"/\1[/g s/}\"(,\s*\"EntityType)/}]\1/g s/\\\\\"/\"/g"&lt;/LI-CODE&gt;</description>
      <pubDate>Sat, 22 Jun 2024 17:26:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691354#M114991</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2024-06-22T17:26:23Z</dc:date>
    </item>
    <item>
      <title>Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691357#M114992</link>
      <description>&lt;P&gt;Data is not always followed by EntityType. The field that comes next is random.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Jun 2024 05:21:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691357#M114992</guid>
      <dc:creator>vn_g</dc:creator>
      <dc:date>2024-06-23T05:21:17Z</dc:date>
    </item>
    <item>
      <title>Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691359#M114993</link>
      <description>&lt;P&gt;It is unlikely to be random, since it is generated by a system. There is likely to be some pattern to it, but if you do not share that information, it is unlikely that we will be able to guess it, and therefore would be wasting our time attempting to provide a solution until you provide sufficient relevant details.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Jun 2024 10:00:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691359#M114993</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2024-06-23T10:00:50Z</dc:date>
    </item>
    <item>
      <title>Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691363#M114996</link>
      <description>&lt;P&gt;1. &lt;A href="https://regex101.com/r/jPZ4yy/1" target="_blank"&gt;https://regex101.com/r/jPZ4yy/1&lt;/A&gt;&lt;BR /&gt;2. &lt;A href="https://regex101.com/r/PmwS2C/1" target="_blank"&gt;https://regex101.com/r/PmwS2C/1&lt;/A&gt;&lt;BR /&gt;3. &lt;A href="https://regex101.com/r/SBMRme/1" target="_blank"&gt;https://regex101.com/r/SBMRme/1&lt;/A&gt;&lt;/P&gt;&lt;P&gt;- First regex: I have provided a sample of 3 events (EntityValue, Name, Ids, or any other JSON field can follow Data).&lt;BR /&gt;- Third regex: sed works on _raw, but it should work only within the Data dictionary value. For example, (\"Comments\": \"New alert\", ) is also changed. Nothing else should be reformatted.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Jun 2024 13:38:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691363#M114996</guid>
      <dc:creator>vn_g</dc:creator>
      <dc:date>2024-06-23T13:38:00Z</dc:date>
    </item>
    <item>
      <title>Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691367#M114997</link>
      <description>&lt;P&gt;1. The actual data looks like the screenshot below. Data is in string format: " { } "&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Actual json data.png" style="width: 999px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/31425iED0BAB9139FBA83C/image-size/large?v=v2&amp;amp;px=999" role="button" title="Actual json data.png" alt="Actual json data.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;2. From the UI, the command below worked to some extent. It converts the Data string into a list: [ { } ]&lt;/SPAN&gt;&lt;STRONG&gt;&lt;SPAN&gt;&lt;BR /&gt;| rex mode=sed "s/(\"Data\":\s+)\"/\1[/g s/(\"Data\":\s+\[{.*})\"/\1]/g s/\\\\\"/\"/g"&lt;BR /&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN&gt;The issue now is that the key-value pairs inside the Data dictionary are not identified automatically, even with KV_MODE = json set.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="working but automatic kv isnot getting detected..png" style="width: 999px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/31426iB45D11BF13C411B9/image-size/large?v=v2&amp;amp;px=999" role="button" title="working but automatic kv isnot getting detected..png" alt="working but automatic kv isnot getting detected..png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 23 Jun 2024 15:12:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691367#M114997</guid>
      <dc:creator>vn_g</dc:creator>
      <dc:date>2024-06-23T15:12:06Z</dc:date>
    </item>
    <item>
      <title>Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691368#M114998</link>
      <description>&lt;PRE&gt;| rex mode=sed "s/(\"Data\":\s+)\"/\1[/g s/(\"Data\":\s+\[{.*})\"/\1]/g s/\\\\\"/\"/g"&lt;BR /&gt;| extract pairdelim="\"{,}" kvdelim=":"&lt;/PRE&gt;&lt;P&gt;Thank you for your help. The above worked, but I want it implemented at index time, not at search time.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Jun 2024 15:22:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691368#M114998</guid>
      <dc:creator>vn_g</dc:creator>
      <dc:date>2024-06-23T15:22:18Z</dc:date>
    </item>
    <item>
      <title>Re: IndexTimeExtraction - Regex Substitute only on a specific group - sedcmd (SplunkCloud)-props.conf</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691530#M115019</link>
      <description>&lt;P&gt;In Splunk, SEDCMD works on _raw. There is no option to apply it to a specific field.&lt;/P&gt;&lt;P&gt;Temporary solution for the case where a field value is passed as a string instead of a list in a JSON file:&lt;/P&gt;&lt;P&gt;Search-time extraction:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex mode=sed "s/(\"Data\":\s+)\"/\1[/g s/(\"Data\":\s+\[{.*})\"/\1]/g s/\\\\\"/\"/g"
| extract pairdelim="\"{,}" kvdelim=":"&lt;/LI-CODE&gt;&lt;P&gt;Index-time extraction:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SEDCMD-o365DataJsonRemoveBackSlash = s/(\\)+"/"/g s/(\"Data\":\s+)\"/\1[/g s/(\"Data\":\s+\[{.*})\"/\1]/g&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 25 Jun 2024 09:33:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/IndexTimeExtraction-Regex-Substitue-only-on-a-specific-group/m-p/691530#M115019</guid>
      <dc:creator>vn_g</dc:creator>
      <dc:date>2024-06-25T09:33:00Z</dc:date>
    </item>
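For comparison, here is a minimal sketch of the scoped replacement the thread asks for, written in Python rather than as a SEDCMD. It assumes the event can be pre-processed before it reaches Splunk, which the thread does not confirm. It is not a Splunk feature, and the names unwrap_data and DATA_STRING and the shortened sample event are hypothetical. The idea is to match only the string literal that follows "Data", decode it, and wrap it in [ ], leaving escaped quotes elsewhere in the event (such as in "Comments") untouched.

```python
import json
import re

# Matches the key "Data" followed by exactly one JSON string literal.
# (?:[^"\\]|\\.)* consumes either a plain character or a backslash escape,
# so the match stops at the first unescaped quote, i.e. the end of the value.
DATA_STRING = re.compile(r'("Data":\s*)"((?:[^"\\]|\\.)*)"')


def unwrap_data(raw: str) -> str:
    """Rewrite "Data": "{...escaped...}" as "Data": [{...}] and change nothing else."""

    def _replace(match: re.Match) -> str:
        # Decode the string literal to recover the embedded JSON text.
        inner = json.loads('"' + match.group(2) + '"')
        return f"{match.group(1)}[{inner}]"

    # Only the first "Data" string value in the event is rewritten.
    return DATA_STRING.sub(_replace, raw, count=1)


# Hypothetical, shortened event. The "Comments" value keeps its escaped quotes.
event = (
    '{"Comments": "say \\"hi\\"", '
    '"Data": "{\\"etype\\":\\"MalwareFamily\\",\\"Status\\":\\"Investigation Started\\"}", '
    '"EntityType": "MalwareFamily"}'
)

fixed = unwrap_data(event)
parsed = json.loads(fixed)  # the rewritten event is valid JSON
print(parsed["Data"][0]["Status"])
```

Because the pattern ends at the first unescaped quote, it does not depend on which field follows Data, which is the part that the greedy .* in the sed expressions cannot guarantee.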
  </channel>
</rss>

