<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>How to improve forwarding performance when importing data from Hadoop? (Getting Data In)</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-improve-forwarding-performance-when-importing-data-from/m-p/352205#M64554</link>
    <description>&lt;P&gt;Hi all, we have a big problem with our forwarder.&lt;BR /&gt;
We need to be able to index about 600 GB/day. We have 10 indexers and 1 forwarder, and we currently index about 260 GB/day; our license covers this volume.&lt;BR /&gt;
We have two problems:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Because we import our data from Hadoop, we can't run many forwarders, since they would monitor the same directories. However, we logically split our data into two groups and are trying to add another forwarder, but it won't connect to the indexers. We copied and renamed the deployment app and created a server class on the deployment server for the new forwarder. Aside from the different inputs, the new and the old forwarder now have the same configuration, yet the new one still refuses to connect.&lt;/LI&gt;
&lt;LI&gt;We can't pin down the bottleneck. We tried the monitoring console and even searched the logs directly, and there are no full queues. Our max kbPerSec is 800 MB/s, which is more than enough. I believe the problem lies in the forwarder, because that is where we see the slow pace; it's not that it forwards fast enough and the indexers are the bottleneck. On the forwarder we extract the timestamp from a field, extract another field using a regex, and set the index name with transforms.&lt;BR /&gt;
What we would like is to find the bottleneck so we can index as fast as we need.&lt;/LI&gt;
&lt;/OL&gt;</description>
    <pubDate>Wed, 31 Jan 2018 22:19:03 GMT</pubDate>
    <dc:creator>eylonronen</dc:creator>
    <dc:date>2018-01-31T22:19:03Z</dc:date>
    <item>
      <title>How to improve forwarding performance when importing data from Hadoop?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-improve-forwarding-performance-when-importing-data-from/m-p/352205#M64554</link>
      <description>&lt;P&gt;Hi all, we have a big problem with our forwarder.&lt;BR /&gt;
We need to be able to index about 600 GB/day. We have 10 indexers and 1 forwarder, and we currently index about 260 GB/day; our license covers this volume.&lt;BR /&gt;
We have two problems:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Because we import our data from Hadoop, we can't run many forwarders, since they would monitor the same directories. However, we logically split our data into two groups and are trying to add another forwarder, but it won't connect to the indexers. We copied and renamed the deployment app and created a server class on the deployment server for the new forwarder. Aside from the different inputs, the new and the old forwarder now have the same configuration, yet the new one still refuses to connect.&lt;/LI&gt;
&lt;LI&gt;We can't pin down the bottleneck. We tried the monitoring console and even searched the logs directly, and there are no full queues. Our max kbPerSec is 800 MB/s, which is more than enough. I believe the problem lies in the forwarder, because that is where we see the slow pace; it's not that it forwards fast enough and the indexers are the bottleneck. On the forwarder we extract the timestamp from a field, extract another field using a regex, and set the index name with transforms.&lt;BR /&gt;
What we would like is to find the bottleneck so we can index as fast as we need.&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Wed, 31 Jan 2018 22:19:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-improve-forwarding-performance-when-importing-data-from/m-p/352205#M64554</guid>
      <dc:creator>eylonronen</dc:creator>
      <dc:date>2018-01-31T22:19:03Z</dc:date>
    </item>
  </channel>
</rss>