<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: file source load balancing in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51446#M9872</link>
    <description>&lt;P&gt;As you guessed, the actual system producing the CDRs is outside of our control, so my thinking was to use a UF on a dedicated instance to download the CSV files and then set them up as an input.&lt;/P&gt;

&lt;P&gt;I was just curious whether there are other ways to tackle this. Is there a way to do this with a single indexer, i.e. have an indexer forward to itself (or to the pool it is in)?&lt;/P&gt;

&lt;P&gt;It seems like the sensible idea is a dedicated forwarding node, so I can also add my scripted inputs to it (they currently run on an indexer, so their data is indexed on only one node).&lt;/P&gt;</description>
    <pubDate>Mon, 16 Jan 2012 20:45:30 GMT</pubDate>
    <dc:creator>nickhills</dc:creator>
    <dc:date>2012-01-16T20:45:30Z</dc:date>
    <item>
      <title>file source load balancing</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51444#M9870</link>
      <description>&lt;P&gt;I am just about to start indexing a large amount of CDR (call detail record) data, which I will be retrieving via SFTP.&lt;/P&gt;

&lt;P&gt;Currently, we Splunk our real-time data by using forwarders on our servers, which load balance into a pool of indexers.&lt;/P&gt;

&lt;P&gt;This spreads the load pretty evenly across the pool, and also means we can take one of the indexers down for updates etc. without affecting our ability to index and report on real-time events.&lt;/P&gt;

&lt;P&gt;What is the best way to take 'flat' files and spread the indexing across the pool?&lt;BR /&gt;
I have thought about writing a script to read the files from the server, and then using a forwarder to push the data to the indexer pool. Is this the best way?&lt;/P&gt;

&lt;P&gt;Is there a way to have an indexer retrieve the files and then push the events back to the pool without indexing them locally first?&lt;/P&gt;

&lt;P&gt;Are there other options that I haven't even considered?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Nick&lt;/P&gt;</description>
      <pubDate>Sun, 15 Jan 2012 22:45:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51444#M9870</guid>
      <dc:creator>nickhills</dc:creator>
      <dc:date>2012-01-15T22:45:39Z</dc:date>
    </item>
    <item>
      <title>Re: file source load balancing</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51445#M9871</link>
      <description>&lt;P&gt;I'm going to presume that installing a Universal Forwarder (UF) directly on the CDR server is not an option for you.&lt;/P&gt;

&lt;P&gt;So you could have a dedicated/standalone UF that receives the CDR log events and load balances them into your Indexer cluster.&lt;BR /&gt;
The UF won't index the events locally; it will simply forward them on.&lt;/P&gt;
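
&lt;P&gt;As a rough sketch of the forwarding side, the UF's outputs.conf could list the indexers in a single target group so that events are load balanced across them. The group name, hostnames and port below are placeholders rather than anything from this thread, and the settings should be checked against the outputs.conf spec for your Splunk version:&lt;/P&gt;

&lt;PRE&gt;
# outputs.conf on the dedicated UF (hypothetical group name and hosts)
[tcpout]
defaultGroup = indexer_pool

[tcpout:indexer_pool]
# each indexer needs a receiving port enabled; 9997 is only the conventional choice
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
# switch to another indexer in the list roughly every 30 seconds
autoLBFrequency = 30
&lt;/PRE&gt;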

&lt;P&gt;So a few ideas for getting the CDR log events to the UF:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Can you send the log events from the CDR server to the UF via syslog (UDP), or stream them over TCP?&lt;/LI&gt;
&lt;LI&gt;Your SFTP scripted input could download the files and write their contents to STDOUT (which the UF monitors), and simply not persist the files to disk.&lt;/LI&gt;
&lt;LI&gt;The CDR server could write the logs to shared storage (e.g. a NAS) that the UF mounts and monitors.&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Mon, 16 Jan 2012 01:12:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51445#M9871</guid>
      <dc:creator>Damien_Dallimor</dc:creator>
      <dc:date>2012-01-16T01:12:58Z</dc:date>
    </item>
    <item>
      <title>Re: file source load balancing</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51446#M9872</link>
      <description>&lt;P&gt;As you guessed, the actual system producing the CDRs is outside of our control, so my thinking was to use a UF on a dedicated instance to download the CSV files and then set them up as an input.&lt;/P&gt;

&lt;P&gt;I was just curious whether there are other ways to tackle this. Is there a way to do this with a single indexer, i.e. have an indexer forward to itself (or to the pool it is in)?&lt;/P&gt;

&lt;P&gt;It seems like the sensible idea is a dedicated forwarding node, so I can also add my scripted inputs to it (they currently run on an indexer, so their data is indexed on only one node).&lt;/P&gt;</description>
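
&lt;P&gt;A minimal sketch of what that input could look like in inputs.conf on the UF, assuming the download script drops the CSV files into a local directory. The path, sourcetype and index names here are made up for illustration:&lt;/P&gt;

&lt;PRE&gt;
# inputs.conf on the dedicated UF (hypothetical path, sourcetype and index)
[monitor:///opt/cdr/incoming/*.csv]
sourcetype = cdr_csv
index = cdr
disabled = false
&lt;/PRE&gt;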
      <pubDate>Mon, 16 Jan 2012 20:45:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51446#M9872</guid>
      <dc:creator>nickhills</dc:creator>
      <dc:date>2012-01-16T20:45:30Z</dc:date>
    </item>
    <item>
      <title>Re: file source load balancing</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51447#M9873</link>
      <description>&lt;P&gt;In this scenario, the dedicated UF architecture will be the cleanest way of achieving high availability and failover into your cluster of Splunk Indexers.&lt;/P&gt;</description>
      <pubDate>Tue, 17 Jan 2012 03:27:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51447#M9873</guid>
      <dc:creator>Damien_Dallimor</dc:creator>
      <dc:date>2012-01-17T03:27:14Z</dc:date>
    </item>
    <item>
      <title>Re: file source load balancing</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51448#M9874</link>
      <description>&lt;P&gt;This is actually the method we went for.&lt;BR /&gt;
A quick script grabs the CDRs by FTP, and then a dedicated forwarder squirts the events at the indexing pool, which gives us some fault tolerance and HA.&lt;/P&gt;

&lt;P&gt;Thanks for your input!&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2012 21:22:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/file-source-load-balancing/m-p/51448#M9874</guid>
      <dc:creator>nickhills</dc:creator>
      <dc:date>2012-02-27T21:22:56Z</dc:date>
    </item>
  </channel>
</rss>

