<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Filtering data out with RegEx based on blacklist works only partially in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537636#M90100</link>
    <description>&lt;P&gt;Hi thanks for your response. The regex matches it, that's my problem.&lt;/P&gt;</description>
    <pubDate>Thu, 28 Jan 2021 14:32:35 GMT</pubDate>
    <dc:creator>kepffr</dc:creator>
    <dc:date>2021-01-28T14:32:35Z</dc:date>
    <item>
      <title>Filtering data out with RegEx based on blacklist works only partially</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537059#M90019</link>
      <description>&lt;P&gt;Hi guys!&lt;/P&gt;&lt;P&gt;I want to filter data out on my forwarder using regular expressions in transforms.conf. The strange thing is that it only works partially, even though my regex itself is, or should be, fine.&lt;/P&gt;&lt;P&gt;transforms.conf&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[deleteAdvertisingTracking]
REGEX=(\t)hostname=.*(adnxs|doubleclick|adsafeprotected|pubmatic|xiti|smartadserver|lijit|ads\.yahoo|insurads)\.(com|net)
DEST_KEY = queue
FORMAT = nullQueue

[deleteShopping]
REGEX=(\t)hostname=.*(amazon|ebay)\.(com|net)
DEST_KEY = queue
FORMAT = nullQueue&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;props.conf&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[STANZA_NAME]
TRANSFORMS-DeleteStuff = deleteAdvertisingTracking,deleteShopping&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The second stanza, "deleteShopping", works just fine, but the first does not.&amp;nbsp;I observed that it stopped working with three or more substrings (e.g.&amp;nbsp;adnxs|doubleclick|adsafeprotected|pubmatic) in the regex. I've tried adding "LOOKAHEAD = 65535", but that didn't help.&lt;/P&gt;&lt;P&gt;Of course, I restarted the forwarder after the changes. Do you have any idea what's going wrong? I'm using Splunk v8.1.1.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jan 2021 14:36:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537059#M90019</guid>
      <dc:creator>kepffr</dc:creator>
      <dc:date>2021-01-25T14:36:07Z</dc:date>
    </item>
    <item>
      <title>Re: Filtering data out with RegEx based on blacklist works only partially</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537447#M90074</link>
      <description>&lt;P&gt;Hi again. I've tried more things but nothing worked. I think my assumption that it works with less substrings is wrong, I cannot reproduce that anymore. Things I've tried:&lt;/P&gt;&lt;P&gt;- Set MATCH_LIMIT to a higher value&lt;/P&gt;&lt;P&gt;- Replaced the sourcetype in the stanza specified in props.conf with source::tcp:1422&lt;/P&gt;&lt;P&gt;- Included a named group in the regex, e.g:&lt;/P&gt;&lt;P&gt;REGEX=(\t)hostname=(.*\.)?(?&amp;lt;site&amp;gt;amazon|ebay)\.(de|com|net)&lt;/P&gt;&lt;P&gt;Data to test:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;2021-01-27 16:55:42	action=Allowed	event_id=6922469125184356358	protocol=SSL	category=Corporate Marketing	dest=1.1.1.1	http_referrer=None	http_user_agent=XXXXXXX	clientpublicIP=1.1.1	status=NA	user=something.something@something.com	url=ebayimg.ebay.com	hostname=ebayimg.ebay.com	clientIP=1.1.1	threatcategory=None	threatname=None	appname=XXX	pagerisk=0	department=XXXX	supercategory=XXXX	appclass=File Share	urlclass=XXXXX	threatclass=None
	bytes_out=2272	bytes_in=5100&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;props.conf&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[XXXXXX]
DATETIME_CONFIG = CURRENT
TZ = UTC
[source::tcp:1422]
TRANSFORMS-DeleteStuff = deleteAdvertisingTracking,deleteShopping&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;transforms.conf&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[deleteAdvertisingTracking]
REGEX=(\t)hostname=.*(adnxs|doubleclick|adsafeprotected|pubmatic|xiti|smartadserver|lijit|ads\.yahoo|insurads)\.(com|net)
DEST_KEY = queue
FORMAT = nullQueue

[deleteShopping]
REGEX=(\t)hostname=(.*\.)?(amazon|ebay)\.(de|com|net|cn)
DEST_KEY = queue
FORMAT = nullQueue&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is what the command "cmd btool transforms list deleteAds" says:&lt;/P&gt;&lt;P&gt;CAN_OPTIMIZE = True&lt;BR /&gt;CLEAN_KEYS = True&lt;BR /&gt;DEFAULT_VALUE =&lt;BR /&gt;DEPTH_LIMIT = 1000&lt;BR /&gt;DEST_KEY = queue&lt;BR /&gt;FORMAT = nullQueue&lt;BR /&gt;KEEP_EMPTY_VALS = False&lt;BR /&gt;LOOKAHEAD = 4096&lt;BR /&gt;MATCH_LIMIT = 100000&lt;BR /&gt;MV_ADD = False&lt;BR /&gt;REGEX = (\t)hostname=(.*\.)?(amazon|ebay)\.(de|com|net|cn)&lt;BR /&gt;SOURCE_KEY = _raw&lt;BR /&gt;WRITE_META = False&lt;/P&gt;</description>
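<!--
One way to sanity-check the two patterns outside Splunk is a small offline test. The sketch below uses Python's re module, which is not the PCRE engine Splunk uses, so treat it only as an approximation. The shopping event is a shortened copy of the test data above; the advertising event and the helper name would_be_dropped are hypothetical and exist only for illustration.

```python
import re

# Patterns copied from the transforms.conf stanzas in this thread.
DELETE_ADS = re.compile(
    r"(\t)hostname=.*(adnxs|doubleclick|adsafeprotected|pubmatic|xiti|"
    r"smartadserver|lijit|ads\.yahoo|insurads)\.(com|net)"
)
DELETE_SHOPPING = re.compile(
    r"(\t)hostname=(.*\.)?(amazon|ebay)\.(de|com|net|cn)"
)

# Shortened version of the test event from the thread; fields are tab-separated.
SHOPPING_EVENT = "\t".join([
    "2021-01-27 16:55:42",
    "action=Allowed",
    "url=ebayimg.ebay.com",
    "hostname=ebayimg.ebay.com",
    "clientIP=1.1.1",
])

# Hypothetical advertising event, not taken from the thread.
ADS_EVENT = "\t".join([
    "2021-01-27 16:55:43",
    "action=Allowed",
    "hostname=ib.adnxs.com",
    "clientIP=1.1.1",
])


def would_be_dropped(event: str) -> bool:
    """Return True if either pattern matches the raw event text."""
    return bool(DELETE_ADS.search(event) or DELETE_SHOPPING.search(event))


if __name__ == "__main__":
    print(would_be_dropped(SHOPPING_EVENT))
    print(would_be_dropped(ADS_EVENT))
```

If both patterns match offline, as the poster reports, the regex itself is less likely to be the cause than where and how the transforms are applied.
-->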
      <pubDate>Wed, 27 Jan 2021 16:38:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537447#M90074</guid>
      <dc:creator>kepffr</dc:creator>
      <dc:date>2021-01-27T16:38:03Z</dc:date>
    </item>
    <item>
      <title>Re: Filtering data out with RegEx based on blacklist works only partially</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537468#M90079</link>
      <description>&lt;P&gt;Is the stanza name&amp;nbsp;&lt;STRONG&gt;deleteAdvertisingTracking&lt;/STRONG&gt; or &lt;STRONG&gt;deleteAds&lt;/STRONG&gt; in transforms.conf?&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jan 2021 17:56:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537468#M90079</guid>
      <dc:creator>manjunathmeti</dc:creator>
      <dc:date>2021-01-27T17:56:53Z</dc:date>
    </item>
    <item>
      <title>Re: Filtering data out with RegEx based on blacklist works only partially</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537582#M90097</link>
      <description>&lt;P&gt;Both, I want to execute both stanzas. Is that not possible?&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jan 2021 08:26:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537582#M90097</guid>
      <dc:creator>kepffr</dc:creator>
      <dc:date>2021-01-28T08:26:37Z</dc:date>
    </item>
    <item>
      <title>Re: Filtering data out with RegEx based on blacklist works only partially</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537609#M90098</link>
      <description>&lt;P&gt;Yes, it is possible. Make sure that the regex of at least one of them matches the logs you want to send to the null queue.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jan 2021 12:25:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537609#M90098</guid>
      <dc:creator>manjunathmeti</dc:creator>
      <dc:date>2021-01-28T12:25:57Z</dc:date>
    </item>
    <item>
      <title>Re: Filtering data out with RegEx based on blacklist works only partially</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537636#M90100</link>
      <description>&lt;P&gt;Hi thanks for your response. The regex matches it, that's my problem.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jan 2021 14:32:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537636#M90100</guid>
      <dc:creator>kepffr</dc:creator>
      <dc:date>2021-01-28T14:32:35Z</dc:date>
    </item>
    <item>
      <title>Re: Filtering data out with RegEx based on blacklist works only partially</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537779#M90121</link>
      <description>&lt;P&gt;Try the configuration below, which drops the "\" characters.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[deleteAdvertisingTracking]
REGEX = hostname=.*(adnxs|doubleclick|adsafeprotected|pubmatic|xiti|smartadserver|lijit|ads.yahoo|insurads).(com|net)
DEST_KEY = queue
FORMAT = nullQueue

[deleteShopping]
REGEX = hostname=.*(amazon|ebay).(de|com|net|cn)
DEST_KEY = queue
FORMAT = nullQueue&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
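<!--
A side effect of dropping the backslashes is that each "." becomes a wildcard for any single character, so the pattern can match more than intended. The sketch below illustrates this with Python's re module (an approximation, since Splunk uses PCRE). Both hostnames and the helper name matches are hypothetical and chosen only to show the difference.

```python
import re

# Pattern from the suggested stanza: the dots are not escaped.
UNESCAPED = re.compile(r"hostname=.*(amazon|ebay).(de|com|net|cn)")
# Pattern from the earlier post, without the leading tab group: literal dots.
ESCAPED = re.compile(r"hostname=(.*\.)?(amazon|ebay)\.(de|com|net|cn)")

# Hypothetical field values used only for illustration.
REAL = "hostname=www.ebay.com"
LOOKALIKE = "hostname=ebayXcom.example.org"


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for name, pattern in (("unescaped", UNESCAPED), ("escaped", ESCAPED)):
        print(name, matches(pattern, REAL), matches(pattern, LOOKALIKE))
```

The unescaped form still matches the real hostname, but it also matches the lookalike, which may or may not be acceptable for a blacklist.
-->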
      <pubDate>Fri, 29 Jan 2021 04:01:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filtering-data-out-with-RegEx-based-on-blacklist-works-only/m-p/537779#M90121</guid>
      <dc:creator>manjunathmeti</dc:creator>
      <dc:date>2021-01-29T04:01:35Z</dc:date>
    </item>
  </channel>
</rss>

