<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Mask a particular field in csv data at Index time in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324334#M60385</link>
    <description>&lt;P&gt;Yes, this will work. But I want to operate on the extracted field rather than on the _raw key. Is there any way to use an extracted field in my SOURCE_KEY attribute?&lt;/P&gt;</description>
    <pubDate>Tue, 29 Sep 2020 17:47:21 GMT</pubDate>
    <dc:creator>divyanshukakwan</dc:creator>
    <dc:date>2020-09-29T17:47:21Z</dc:date>
    <item>
      <title>Mask a particular field in csv data at Index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324331#M60382</link>
      <description>&lt;P&gt;I have CSV data that contains sensitive information such as client IP addresses. Here is what one row of the data looks like:&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;David, London,...several more columns...,192.168.0.1&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;I want to mask the IP by replacing it with the string "XXXXXXX", so that the row above becomes:&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;David, London, ...several more columns..., XXXXXXX&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;Also, this operation needs to be performed at index-time.&lt;/P&gt;

&lt;P&gt;I have tried setting up transforms in props.conf and transforms.conf:&lt;/P&gt;

&lt;P&gt;[source::data.csv]&lt;BR /&gt;
TRANSFORMS-masking = pii-mask&lt;/P&gt;

&lt;P&gt;[pii-mask]&lt;BR /&gt;
REGEX = .*&lt;BR /&gt;
FORMAT = ClientIP::XXXXXX&lt;BR /&gt;
SOURCE_KEY = ClientIP&lt;BR /&gt;
DEST_KEY = ClientIP &lt;/P&gt;

&lt;P&gt;However, even after doing this, the IP still shows up in the indexed events. Can anybody tell me how to fix this?&lt;/P&gt;

&lt;P&gt;It seems to me that the fields have not been extracted when the transforms are run. If this is the case, how should I get extraction done before transformation?&lt;/P&gt;

&lt;P&gt;Edit:&lt;BR /&gt;
One of the columns in the data is an address. This field can contain an arbitrary number of commas, for example: "#221, Baker Street, London, England". So I can't use a simple regular expression in sed. Instead, I want to know how to apply transforms to an extracted field rather than to the _raw field.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 17:46:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324331#M60382</guid>
      <dc:creator>divyanshukakwan</dc:creator>
      <dc:date>2020-09-29T17:46:53Z</dc:date>
    </item>
    <item>
      <title>Re: Mask a particular field in csv data at Index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324332#M60383</link>
      <description>&lt;P&gt;Try following the simple sed example here: &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_script"&gt;https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_script&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;transforms.conf -&lt;/P&gt;

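&lt;PRE&gt;&lt;CODE&gt;# Note: this REGEX expects the literal text "ClientIP=" in the raw event.
# The sample CSV row in the question has no such key, so adapt the pattern to match the bare IP.
[ClientIP-anonymizer]
REGEX = (?m)^(.*)ClientIP=\d+\.\d+\.\d+\.\d+(.*)$
FORMAT = $1ClientIP=########$2
DEST_KEY = _raw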
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;props.conf - &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[source::data.csv]
TRANSFORMS-anonymize = ClientIP-anonymizer
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Hope this helps!&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jan 2018 15:56:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324332#M60383</guid>
      <dc:creator>493669</dc:creator>
      <dc:date>2018-01-23T15:56:39Z</dc:date>
    </item>
    <item>
      <title>Re: Mask a particular field in csv data at Index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324333#M60384</link>
      <description>&lt;P&gt;Try this in props.conf. You need the position at which the field appears in the CSV; I'm assuming it is the 15th column, so adjust the count accordingly.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[source::data.csv]
SEDCMD-masking = s/^(([^\,]+,){14})(\d+\.\d+\.\d+\.\d+)/\1XX.XX.XX.XX/
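# Alternative sketch, under an assumption: if ClientIP is always the last, unquoted
# column (as in the sample row), anchor on the end of the line instead of counting
# commas, so that commas inside the address column do not matter.
# Use one of the two SEDCMD lines, not both; the class name "masklastcol" is illustrative.
SEDCMD-masklastcol = s/\d+\.\d+\.\d+\.\d+\s*$/XXXXXXX/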
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 23 Jan 2018 16:14:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324333#M60384</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2018-01-23T16:14:57Z</dc:date>
    </item>
    <item>
      <title>Re: Mask a particular field in csv data at Index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324334#M60385</link>
      <description>&lt;P&gt;Yes, this will work. But I want to operate on the extracted field rather than on the _raw key. Is there any way to use an extracted field in my SOURCE_KEY attribute?&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 17:47:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324334#M60385</guid>
      <dc:creator>divyanshukakwan</dc:creator>
      <dc:date>2020-09-29T17:47:21Z</dc:date>
    </item>
    <item>
      <title>Re: Mask a particular field in csv data at Index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324335#M60386</link>
      <description>&lt;P&gt;Hey,&lt;BR /&gt;
 if you only need to mask the extracted field when displaying results at search time, you can use the SPL query below:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;base search&amp;gt;| replace * WITH XXXXXX IN ClientIP
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 24 Jan 2018 06:36:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324335#M60386</guid>
      <dc:creator>493669</dc:creator>
      <dc:date>2018-01-24T06:36:53Z</dc:date>
    </item>
    <item>
      <title>Re: Mask a particular field in csv data at Index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324336#M60387</link>
      <description>&lt;P&gt;I want to do the masking at index time, not at search time.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jan 2018 07:00:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Mask-a-particular-field-in-csv-data-at-Index-time/m-p/324336#M60387</guid>
      <dc:creator>divyanshukakwan</dc:creator>
      <dc:date>2018-01-24T07:00:00Z</dc:date>
    </item>
  </channel>
</rss>

