<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Filter Logs in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702896#M116248</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/273531"&gt;@SalahKhattab&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;No, it's the opposite: you define regex extractions only for the fields you want, and the others will not be extracted (provided you haven't set INDEXED_EXTRACTIONS=XML).&lt;/P&gt;&lt;P&gt;Let me know if I can help further; otherwise, please accept an answer for the benefit of other Community members.&lt;/P&gt;&lt;P&gt;Ciao and happy splunking&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;&lt;P&gt;P.S.: Karma Points are appreciated &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 27 Oct 2024 09:18:23 GMT</pubDate>
    <dc:creator>gcusello</dc:creator>
    <dc:date>2024-10-27T09:18:23Z</dc:date>
    <item>
      <title>Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702888#M116243</link>
      <description>&lt;P&gt;I have XML input logs in Splunk.&lt;/P&gt;&lt;P&gt;I have already extracted the required fields, totaling 10 fields.&lt;/P&gt;&lt;P&gt;I need to ensure any other fields that are extracted are ignored and not indexed in Splunk.&lt;/P&gt;&lt;P&gt;Can I set it so that if a field is not in the extracted list, it is automatically ignored?&lt;/P&gt;&lt;P&gt;Is this possible?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 08:35:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702888#M116243</guid>
      <dc:creator>SalahKhattab</dc:creator>
      <dc:date>2024-10-27T08:35:32Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702889#M116244</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/273531"&gt;@SalahKhattab&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Unless you extracted your fields at index time, fields are extracted at search time, so all the fields that you configured will be extracted.&lt;/P&gt;&lt;P&gt;I suppose that you extracted the fields using INDEXED_EXTRACTIONS=XML; in this case all the fields you have are extracted at search time, and this doesn't consume storage or memory.&lt;/P&gt;&lt;P&gt;It's different if you use regex extractions and not INDEXED_EXTRACTIONS=XML: in that case, only the configured fields are extracted.&lt;/P&gt;&lt;P&gt;Why is it so important to you that the other fields aren't extracted?&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 08:40:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702889#M116244</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2024-10-27T08:40:45Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702890#M116245</link>
      <description>&lt;P&gt;Hello Giuseppe,&lt;/P&gt;&lt;P&gt;In my case, the goal is to ensure that the data is cleaned before indexing.&lt;/P&gt;&lt;P&gt;For instance, if the data is:&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&amp;lt;test&amp;gt;dasdada&amp;lt;/test&amp;gt;&amp;lt;test2&amp;gt;asdasda&amp;lt;/test2&amp;gt;&lt;/SPAN&gt;&lt;DIV class=""&gt;&lt;P&gt;I only need the data for the &amp;lt;test&amp;gt; field, and I don’t want the &amp;lt;test2&amp;gt; field to appear. Additionally, there are many fields that I don’t require, so creating a regex for each unwanted field to remove it with SEDCMD or a blacklist would be challenging.&lt;/P&gt;&lt;P&gt;Is there a way to delete fields that aren’t extracted from the log before indexing?&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sun, 27 Oct 2024 08:45:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702890#M116245</guid>
      <dc:creator>SalahKhattab</dc:creator>
      <dc:date>2024-10-27T08:45:34Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702891#M116246</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/273531"&gt;@SalahKhattab&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;If you want to avoid indexing part of the data, the job is more complicated, because the only way is the approach used to anonymize data (&lt;A href="https://docs.splunk.com/Documentation/Splunk/9.3.1/Data/Anonymizedata" target="_blank"&gt;https://docs.splunk.com/Documentation/Splunk/9.3.1/Data/Anonymizedata&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;In other words, you should delete some parts of your logs before indexing.&lt;/P&gt;&lt;P&gt;Why do you want to do this: to save on license costs, or to prevent some data from being visible?&lt;/P&gt;&lt;P&gt;If you don't have either of these requirements, I suggest indexing all the data, because the removed data could be useful to you.&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 08:50:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702891#M116246</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2024-10-27T08:50:21Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702894#M116247</link>
      <description>&lt;P&gt;Okay, got it.&lt;/P&gt;&lt;P&gt;One last thing: is there a regex that can exclude from indexing any field that is not in the extracted list?&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 09:11:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702894#M116247</guid>
      <dc:creator>SalahKhattab</dc:creator>
      <dc:date>2024-10-27T09:11:21Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702896#M116248</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/273531"&gt;@SalahKhattab&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;No, it's the opposite: you define regex extractions only for the fields you want, and the others will not be extracted (provided you haven't set INDEXED_EXTRACTIONS=XML).&lt;/P&gt;&lt;P&gt;Let me know if I can help further; otherwise, please accept an answer for the benefit of other Community members.&lt;/P&gt;&lt;P&gt;Ciao and happy splunking&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;&lt;P&gt;P.S.: Karma Points are appreciated &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 09:18:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702896#M116248</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2024-10-27T09:18:23Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702897#M116249</link>
      <description>&lt;P&gt;Sorry, I didn’t quite get your point. Let me clarify.&lt;/P&gt;&lt;P&gt;For example, if this is my data:&lt;/P&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&amp;lt;Interceptor&amp;gt; &amp;lt;AttackCoords&amp;gt;-423423445345345.10742916222947&amp;lt;/AttackCoords&amp;gt; &amp;lt;Outcome&amp;gt;2&amp;lt;/Outcome&amp;gt; &amp;lt;Infiltrators&amp;gt;20&amp;lt;/Infiltrators&amp;gt; &amp;lt;Enforcer&amp;gt;2&amp;lt;/Enforcer&amp;gt; &amp;lt;ActionDate&amp;gt;2-04-24&amp;lt;/ActionDate&amp;gt; &amp;lt;ActionTime&amp;gt;00:2:00&amp;lt;/ActionTime&amp;gt; &amp;lt;RecordNotes&amp;gt;test&amp;lt;/RecordNotes&amp;gt; &amp;lt;NumEscaped&amp;gt;0&amp;lt;/NumEscaped&amp;gt; &amp;lt;LaunchCoords&amp;gt;-222222&amp;lt;/LaunchCoords&amp;gt; &amp;lt;AttackVessel&amp;gt;111&amp;lt;/AttackVessel&amp;gt; &amp;lt;/Interceptor&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;P&gt;I want to extract only ActionDate and RecordNotes and ignore all other fields during ingestion. This way, the data will be cleared of unnecessary fields. In transforms.conf, I aim to create a regex pattern for ActionDate and RecordNotes to filter out other fields, making the resulting data look like this:&lt;/P&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&amp;lt;Interceptor&amp;gt; &amp;lt;ActionDate&amp;gt;2-04-24&amp;lt;/ActionDate&amp;gt; &amp;lt;RecordNotes&amp;gt;test&amp;lt;/RecordNotes&amp;gt; &amp;lt;/Interceptor&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;P&gt;How can I achieve this?&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 09:25:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702897#M116249</guid>
      <dc:creator>SalahKhattab</dc:creator>
      <dc:date>2024-10-27T09:25:06Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702900#M116250</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/273531"&gt;@SalahKhattab&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Read the link above on anonymizing data; it shows how to use SEDCMD in props.conf to remove part of your logs:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SEDCMD-reduce_fields = s/&amp;lt;Interceptor&amp;gt;(.*)\&amp;lt;ActionDate\&amp;gt;2-04-24\&amp;lt;\/ActionDate\&amp;gt;(.*)\&amp;lt;RecordNotes\&amp;gt;test\&amp;lt;\/RecordNotes\&amp;gt;(.*)\&amp;lt;\/Interceptor\&amp;gt;/&amp;lt;Interceptor\&amp;gt;\&amp;lt;ActionDate\&amp;gt;2-04-24\&amp;lt;\/ActionDate\&amp;gt;\&amp;lt;RecordNotes\&amp;gt;test\&amp;lt;\/RecordNotes\&amp;gt;\&amp;lt;\/Interceptor\&amp;gt;/g&lt;/LI-CODE&gt;&lt;P&gt;You can test it at&amp;nbsp;&lt;A href="https://regex101.com/r/fIpO23/1" target="_blank"&gt;https://regex101.com/r/fIpO23/1&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 09:36:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702900#M116250</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2024-10-27T09:36:06Z</dc:date>
    </item>
    <item>
      <title>Re: Filter Logs</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702908#M116253</link>
      <description>&lt;P&gt;Manipulating structured data with regexes is not a very good idea. It would be better to use an external tool to clean up your data before ingesting.&lt;/P&gt;</description>
      <pubDate>Sun, 27 Oct 2024 19:09:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-Logs/m-p/702908#M116253</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2024-10-27T19:09:26Z</dc:date>
    </item>
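    <!--
    A minimal sketch of the external clean-up suggested in the last reply, assuming
    Python and its standard xml.etree.ElementTree module (the thread does not name
    a tool). The function name keep_only and the KEEP allow-list are hypothetical;
    the sample is a shortened version of the Interceptor event quoted in the thread.
    It parses each event as XML and drops every child element that is not on the
    allow-list, so no per-field regex is needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list: the two fields the thread wants to keep.
KEEP = {"ActionDate", "RecordNotes"}


def keep_only(event_xml: str, keep=KEEP) -> str:
    """Return the event with every direct child not named in `keep` removed."""
    root = ET.fromstring(event_xml)
    # Iterate over a copy, because root is modified inside the loop.
    for child in list(root):
        if child.tag not in keep:
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


# Shortened sample event (assumed to be one well-formed XML element per event).
sample = (
    "<Interceptor>"
    "<Outcome>2</Outcome>"
    "<ActionDate>2-04-24</ActionDate>"
    "<RecordNotes>test</RecordNotes>"
    "<NumEscaped>0</NumEscaped>"
    "</Interceptor>"
)

cleaned = keep_only(sample)
print(cleaned)
```

    Such a script would run on the events before they reach Splunk, for example
    on the files a forwarder monitors; how it is wired in depends on the deployment.
    -->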
  </channel>
</rss>

