<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dealing with a UF client that is sending too much data in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501774#M85486</link>
    <description>&lt;P&gt;Hi @eddpot,&lt;BR /&gt;
In Splunk it's possible to filter logs before indexing.&lt;BR /&gt;
To do this, you should first understand which subset of these logs you really need, both in normal and in error conditions.&lt;BR /&gt;
Then you have to find a way to identify them, or to identify the logs to discard, using one or more regexes, and then filter out the unwanted logs before indexing (for more information, see &lt;A href="https://docs.splunk.com/Documentation/Splunk/8.0.0/Forwarding/Routeandfilterdatad"&gt;https://docs.splunk.com/Documentation/Splunk/8.0.0/Forwarding/Routeandfilterdatad&lt;/A&gt;).&lt;BR /&gt;
Obviously, this way you lose data that you will no longer be able to use for debugging or other use cases.&lt;/P&gt;

&lt;P&gt;Ciao.&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
    <pubDate>Thu, 05 Dec 2019 12:18:55 GMT</pubDate>
    <dc:creator>gcusello</dc:creator>
    <dc:date>2019-12-05T12:18:55Z</dc:date>
    <item>
      <title>Dealing with a UF client that is sending too much data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501773#M85485</link>
      <description>&lt;P&gt;I have a number of Windows clients using the Universal Forwarder to send a small log file to Splunk, typically around 15 KB per day per client.&lt;/P&gt;

&lt;P&gt;However, when testing this I found a client that is sending almost 1 GB a day rather than the expected 15 KB. It appears this client is having issues and is writing a massive number of errors to its log daily.&lt;/P&gt;

&lt;P&gt;If I scale up the deployment of the UF for this app to more clients, then I am concerned that multiple clients having this issue could push my data ingest up to an unsustainable level.&lt;/P&gt;

&lt;P&gt;I need to be able to reduce the amount of data this client (and any future clients with the same issue) sends, but I don't want to exclude it entirely, as then I won't be able to see which clients are having this manic log-writing issue.&lt;/P&gt;

&lt;P&gt;What is the best way to solve this? Can I limit the total data that can be forwarded per client for this app, or can I do some de-duplication on the data prior to forwarding in order to reduce the amount sent? It writes the same log lines repeatedly within the same timestamp.&lt;/P&gt;

&lt;P&gt;Thanks for any advice you can offer.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 11:43:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501773#M85485</guid>
      <dc:creator>eddpot</dc:creator>
      <dc:date>2019-12-05T11:43:28Z</dc:date>
    </item>
    <item>
      <title>Re: Dealing with a UF client that is sending too much data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501774#M85486</link>
      <description>&lt;P&gt;Hi @eddpot,&lt;BR /&gt;
In Splunk it's possible to filter logs before indexing.&lt;BR /&gt;
To do this, you should first understand which subset of these logs you really need, both in normal and in error conditions.&lt;BR /&gt;
Then you have to find a way to identify them, or to identify the logs to discard, using one or more regexes, and then filter out the unwanted logs before indexing (for more information, see &lt;A href="https://docs.splunk.com/Documentation/Splunk/8.0.0/Forwarding/Routeandfilterdatad"&gt;https://docs.splunk.com/Documentation/Splunk/8.0.0/Forwarding/Routeandfilterdatad&lt;/A&gt;).&lt;BR /&gt;
Obviously, this way you lose data that you will no longer be able to use for debugging or other use cases.&lt;/P&gt;
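
&lt;P&gt;As a minimal sketch of what the linked docs describe (the sourcetype name and regex below are placeholders to adapt to your logs, and the stanzas must go on the indexers or a heavy forwarder, since a Universal Forwarder cannot apply them itself):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf -- bind a filtering transform to your sourcetype
# ("my_app_logs" is a placeholder for your actual sourcetype)
[my_app_logs]
TRANSFORMS-drop_noise = drop_repeated_errors

# transforms.conf -- send events matching the regex to nullQueue,
# i.e. discard them before they are indexed
[drop_repeated_errors]
REGEX = (?i)the repeated error message you want to drop
DEST_KEY = queue
FORMAT = nullQueue&lt;/CODE&gt;&lt;/PRE&gt;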

&lt;P&gt;Ciao.&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 12:18:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501774#M85486</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2019-12-05T12:18:55Z</dc:date>
    </item>
    <item>
      <title>Re: Dealing with a UF client that is sending too much data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501775#M85487</link>
      <description>&lt;P&gt;Thanks @gcusello&lt;/P&gt;

&lt;P&gt;That seems like a good solution. Unfortunately, I can't come up with a good filter that would reduce the entries significantly while still leaving enough data to identify a faulty client.&lt;/P&gt;

&lt;P&gt;If it's not possible to de-duplicate the logs before indexing (and I don't think it is), then there may not be a good solution available to me.&lt;/P&gt;

&lt;P&gt;However, your reply did answer the question so would it be good form for me to mark your answer as 'Accepted'?&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 15:06:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501775#M85487</guid>
      <dc:creator>eddpot</dc:creator>
      <dc:date>2019-12-05T15:06:09Z</dc:date>
    </item>
    <item>
      <title>Re: Dealing with a UF client that is sending too much data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501776#M85488</link>
      <description>&lt;P&gt;Thank you!&lt;BR /&gt;
Ciao and see you next time!&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 15:26:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501776#M85488</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2019-12-05T15:26:13Z</dc:date>
    </item>
    <item>
      <title>Re: Dealing with a UF client that is sending too much data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501777#M85489</link>
      <description>&lt;P&gt;Hi @eddpot,&lt;BR /&gt;
if you could identify these exceptional logs, you could route them to a different index with a short retention period (e.g. one or two days) that you can use to analyze the flows and troubleshoot problems: search Splunk Answers for how to do it, you don't need another question!&lt;BR /&gt;
This way you still consume license, but far less storage.&lt;/P&gt;
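
&lt;P&gt;As a rough sketch of that routing (again, the sourcetype, regex, and index name are placeholders, and the transform runs on the indexers or a heavy forwarder):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf -- bind a routing transform to the sourcetype
[my_app_logs]
TRANSFORMS-route_noise = route_repeated_errors

# transforms.conf -- rewrite the target index for matching events
[route_repeated_errors]
REGEX = (?i)the repeated error message
DEST_KEY = _MetaData:Index
FORMAT = short_retention

# indexes.conf -- on the indexers: define the index with ~2 days
# of retention (172800 seconds) before data is frozen/deleted
[short_retention]
homePath   = $SPLUNK_DB/short_retention/db
coldPath   = $SPLUNK_DB/short_retention/colddb
thawedPath = $SPLUNK_DB/short_retention/thaweddb
frozenTimePeriodInSecs = 172800&lt;/CODE&gt;&lt;/PRE&gt;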

&lt;P&gt;Ciao.&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 15:41:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Dealing-with-a-UF-client-that-is-sending-too-much-data/m-p/501777#M85489</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2019-12-05T15:41:21Z</dc:date>
    </item>
  </channel>
</rss>