<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI? in All Apps and Add-ons</title>
    <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249293#M28449</link>
    <description>&lt;P&gt;Hi @ddrillic,&lt;/P&gt;

&lt;P&gt;When you say "&lt;STRONG&gt;Splunk Analytics for Hadoop server&lt;/STRONG&gt;", does it refer to the Splunk instance (Search Head) that is used to interact with HDFS?&lt;/P&gt;</description>
    <pubDate>Tue, 29 Nov 2016 09:12:18 GMT</pubDate>
    <dc:creator>Harishma</dc:creator>
    <dc:date>2016-11-29T09:12:18Z</dc:date>
    <item>
      <title>Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249291#M28447</link>
      <description>&lt;P&gt;Hi Team,&lt;/P&gt;

&lt;P&gt;I'm trying to set up Splunk Analytics for Hadoop in my DEV environment. I'm setting up a Hadoop cluster on one server and a Splunk Search Head on another.&lt;BR /&gt;
I understand the basics of Hadoop, so I'm learning further by working on this POC.&lt;/P&gt;

&lt;P&gt;In the docs, I read that I need to set up the Hadoop CLI on the Splunk instance.&lt;BR /&gt;
What is this? Sorry, I read the doc but couldn't understand much. If someone could elaborate on this, it would be great.&lt;/P&gt;

&lt;P&gt;In certain places the docs state the following:&lt;BR /&gt;
"Download and extract the correct Hadoop CLI for each Hadoop cluster"&lt;BR /&gt;
"test that your Hadoop CLI is set up properly and can connect to your Hadoop cluster"&lt;/P&gt;

&lt;P&gt;I'm quite confused, what is this Hadoop CLI? Please guide.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Nov 2016 10:52:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249291#M28447</guid>
      <dc:creator>Harishma</dc:creator>
      <dc:date>2016-11-28T10:52:17Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249292#M28448</link>
      <description>&lt;P&gt;Hi Harishma,&lt;/P&gt;

&lt;P&gt;CLI is a command-line interface or command language interpreter. &lt;/P&gt;

&lt;P&gt;From the &lt;STRONG&gt;Splunk Analytics for Hadoop server&lt;/STRONG&gt; you need to be able to connect to HDFS and Hive via the Hadoop CLI commands. With MapR Hadoop, we achieve this by installing the MapR client on the &lt;STRONG&gt;Splunk Analytics for Hadoop server&lt;/STRONG&gt;.&lt;/P&gt;

&lt;P&gt;I hope it helps...&lt;/P&gt;</description>
      <pubDate>Mon, 28 Nov 2016 14:44:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249292#M28448</guid>
      <dc:creator>ddrillic</dc:creator>
      <dc:date>2016-11-28T14:44:02Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249293#M28449</link>
      <description>&lt;P&gt;Hi @ddrillic,&lt;/P&gt;

&lt;P&gt;When you say "&lt;STRONG&gt;Splunk Analytics for Hadoop server&lt;/STRONG&gt;", does it refer to the Splunk instance (Search Head) that is used to interact with HDFS?&lt;/P&gt;</description>
      <pubDate>Tue, 29 Nov 2016 09:12:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249293#M28449</guid>
      <dc:creator>Harishma</dc:creator>
      <dc:date>2016-11-29T09:12:18Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249294#M28450</link>
      <description>&lt;P&gt;Correct, the Hadoop client directory needs to be present on the Search Head that is talking to Hadoop. This directory will contain both the executables for things like the CLI and the jar files (Java libraries) used to connect programmatically to Hadoop.&lt;/P&gt;

&lt;P&gt;BTW, the reason that you need to provide these to Splunk Analytics for Hadoop, as opposed to them being provided with Splunk, is that the libraries need to match your Hadoop distribution, i.e. the vendor and version number of Hadoop that you are using.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Nov 2016 18:24:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249294#M28450</guid>
      <dc:creator>kschon_splunk</dc:creator>
      <dc:date>2016-11-29T18:24:27Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249295#M28451</link>
      <description>&lt;P&gt;Hi @kschon, @ddrillic,&lt;/P&gt;

&lt;P&gt;I set up the Hadoop YARN CLI on my server &lt;STRONG&gt;ABC (the Splunk Analytics for Hadoop server)&lt;/STRONG&gt;.&lt;/P&gt;

&lt;P&gt;I ran the command below on server ABC to test the connection to my Hadoop cluster server, sl55selappn.tesco.com.&lt;/P&gt;

&lt;P&gt;$HADOOP_HOME/bin/hadoop fs -ls hdfs://sl55selappn.tesco.com:9000&lt;/P&gt;

&lt;P&gt;I got the below error. Where am I going wrong? Could you please guide?&lt;/P&gt;

&lt;P&gt;16/12/02 04:41:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable&lt;BR /&gt;
ls: `hdfs://sl55selappn.tesco.com:9000': No such file or directory&lt;/P&gt;</description>
      <pubDate>Fri, 02 Dec 2016 10:09:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249295#M28451</guid>
      <dc:creator>Harishma</dc:creator>
      <dc:date>2016-12-02T10:09:21Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249296#M28452</link>
      <description>&lt;P&gt;The problem is that you have specified the name-node, but not a directory. You can append a directory to the end of the name-node, e.g. to list the contents of "/foo" you could use:&lt;BR /&gt;
$HADOOP_HOME/bin/hadoop fs -ls hdfs://sl55selappn.tesco.com:9000/foo&lt;/P&gt;

&lt;P&gt;To make this a little easier to read, you can list the name-node separately with the "-fs" option, like so:&lt;BR /&gt;
$HADOOP_HOME/bin/hadoop fs -fs hdfs://sl55selappn.tesco.com:9000 -ls /foo&lt;/P&gt;
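
&lt;P&gt;The split between name-node and directory that the "-fs" form makes explicit can be sketched roughly as follows. This is only an illustration in plain shell: split_hdfs_uri is a hypothetical helper, not a Hadoop command, and it does not contact the cluster. It just shows that the original URI carries no directory part.&lt;/P&gt;

```shell
#!/bin/sh
# Sketch: split an HDFS URI into the name-node part and the directory part.
# split_hdfs_uri is a hypothetical helper, not a Hadoop command.
split_hdfs_uri() {
    rest="${1#hdfs://}"          # drop the scheme
    authority="${rest%%/*}"      # host:port, up to the first slash
    path=${rest#"$authority"}    # everything after host:port; may be empty
    if [ -z "$path" ]; then
        path="(empty)"
    fi
    echo "namenode=hdfs://$authority path=$path"
}

# The URI from the failing command, then the one with a directory appended.
split_hdfs_uri "hdfs://sl55selappn.tesco.com:9000"
split_hdfs_uri "hdfs://sl55selappn.tesco.com:9000/foo"
```

&lt;P&gt;The first call reports path=(empty) and the second reports path=/foo, which is the part the "-ls" needs.&lt;/P&gt;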

&lt;P&gt;To list the contents of the root dir, try this:&lt;BR /&gt;
$HADOOP_HOME/bin/hadoop fs -fs hdfs://sl55selappn.tesco.com:9000 -ls /&lt;/P&gt;</description>
      <pubDate>Sat, 03 Dec 2016 00:44:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249296#M28452</guid>
      <dc:creator>kschon_splunk</dc:creator>
      <dc:date>2016-12-03T00:44:23Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249297#M28453</link>
      <description>&lt;P&gt;Hi @kschon,&lt;/P&gt;

&lt;P&gt;Yup, that worked! Thanks a lot &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;But I'm sorry, I'm back to my original doubt. Sorry if it sounds lame.&lt;BR /&gt;
I have the YARN CLI installed on my Splunk Analytics for Hadoop server. Does that mean this server is like a single-node Hadoop cluster, where the namenode, datanodes, tasktrackers, etc. all exist on the same server?&lt;BR /&gt;
What I'm trying to understand is the folder $HADOOP_HOME/etc/hadoop on this server: should the env and XML files under it reference my &lt;STRONG&gt;actual Hadoop cluster&lt;/STRONG&gt; or the &lt;STRONG&gt;Splunk Analytics for Hadoop server&lt;/STRONG&gt; parameters?&lt;BR /&gt;
I hope I have conveyed my doubt.&lt;/P&gt;

&lt;P&gt;I don't understand how they are able to communicate. I was able to create a text file in the datanode directory from the Splunk Analytics for Hadoop server.&lt;BR /&gt;
These two servers have nothing in common. Each is a separate single-node cluster, so how is the communication between them happening?&lt;/P&gt;

&lt;P&gt;Please clarify.&lt;/P&gt;</description>
      <pubDate>Sun, 04 Dec 2016 15:04:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249297#M28453</guid>
      <dc:creator>Harishma</dc:creator>
      <dc:date>2016-12-04T15:04:06Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249298#M28454</link>
      <description>&lt;P&gt;If your configuration is correct, your SH should be talking to your real Hadoop cluster, not running in local mode, and not running a local single node cluster. Please read the manual for configuring your version of Hadoop. &lt;/P&gt;

&lt;P&gt;By setting the locations of the file system, resource manager, and scheduler in your XML files, you can control the default locations that your Hadoop client will point to. You can override these on the command line, for example using the "-fs" option for the filesystem. If you are not sure where you are pointing by default, try doing a "fs -ls" with and without specifying the filesystem and see if you get the same thing. If you can't tell, use "fs -put" to put a marker file that you can then look for. &lt;/P&gt;
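
&lt;P&gt;A minimal sketch of that check, under a few assumptions: the name-node URI is the one used earlier in this thread, the marker path (and the existence of a writable /tmp in HDFS) is made up for illustration, and fs_ls_args is a hypothetical helper that only builds the argument list. The Hadoop commands run only if a client is found under $HADOOP_HOME.&lt;/P&gt;

```shell
#!/bin/sh
# Sketch: find out which filesystem the Hadoop client points to by default.
# NAMENODE and MARKER_PATH are illustrative values; adjust for your cluster.
NAMENODE="hdfs://sl55selappn.tesco.com:9000"
MARKER_PATH="/tmp/splunk_sh_marker.txt"   # assumes /tmp exists and is writable in HDFS
HADOOP="${HADOOP_HOME:-}/bin/hadoop"

# Build the arguments for a listing. An empty first argument means
# "use the default filesystem from the client's XML files".
fs_ls_args() {
    if [ -n "$1" ]; then
        echo "fs -fs $1 -ls $2"
    else
        echo "fs -ls $2"
    fi
}

if [ -x "$HADOOP" ]; then
    # List the root dir with the default, then with the explicit name-node.
    "$HADOOP" $(fs_ls_args "" /)
    "$HADOOP" $(fs_ls_args "$NAMENODE" /)

    # Put a marker file via the default filesystem (source "-" reads stdin),
    # then look for it on the explicit one.
    echo "marker from $(hostname)" | "$HADOOP" fs -put - "$MARKER_PATH"
    "$HADOOP" $(fs_ls_args "$NAMENODE" "$MARKER_PATH")
else
    echo "no Hadoop client found at $HADOOP"
fi
```

&lt;P&gt;If the two listings match and the marker file shows up on the explicit name-node, the client defaults already point at the real cluster.&lt;/P&gt;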

&lt;P&gt;When you configure a Splunk Analytics for Hadoop provider, you can specify parameters such as:&lt;BR /&gt;
vix.fs.default.name&lt;BR /&gt;
vix.yarn.resourcemanager.address&lt;BR /&gt;
vix.yarn.resourcemanager.scheduler.address &lt;/P&gt;

&lt;P&gt;These will override the values in your XML files.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Dec 2016 21:41:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249298#M28454</guid>
      <dc:creator>kschon_splunk</dc:creator>
      <dc:date>2016-12-05T21:41:30Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249299#M28455</link>
      <description>&lt;P&gt;Oh yes, got it! I realized my mistake: I had made the SH run in local mode as well.&lt;/P&gt;

&lt;P&gt;I redid the SH configuration and it worked! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; Thank you so much for your patience in helping me with this! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; It really helped me understand the entire flow.&lt;/P&gt;

&lt;P&gt;If you don't mind, I have two last doubts on this topic:&lt;BR /&gt;
1) While creating a provider, most of the values were auto-populated. Is there anything I need to modify here?&lt;BR /&gt;
For example, should I modify the setting below?&lt;BR /&gt;
vix.splunk.home.datanode = /tmp/splunk/$SPLUNK_SERVER_NAME/&lt;BR /&gt;
Should I provide the SPLUNK_SERVER_NAME here?&lt;/P&gt;

&lt;P&gt;2) After I created a virtual index and tried to explore data, the UI gave the error below:&lt;BR /&gt;
[myhadoopprovider] Error in 'ExternalResultProvider': Hadoop CLI may not be set correctly. Please check HADOOP_HOME and Default Filesystem in the provider settings for this virtual index. Running /home/splunkd1/hadoop-2.7.2/bin/hadoop fs -stat hdfs://ABC.tesco.com:8020/ should return successfully, rc=1, error=16/12/06 08:52:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable stat: Call From XYZ.tesco.com/11.199.169.176 to ABC.tesco.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: &lt;A href="http://wiki.apache.org/hadoop/ConnectionRefused" target="_blank"&gt;http://wiki.apache.org/hadoop/ConnectionRefused&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;ABC --&amp;gt; Hadoop Server&lt;BR /&gt;
XYZ --&amp;gt; Splunk SH&lt;/P&gt;

&lt;P&gt;/home/splunkd1/hadoop-2.7.2/bin/hadoop fs -ls hdfs://ABC.fmrco.com:9000/ --&amp;gt; This worked, but why doesn't the command below?&lt;BR /&gt;
Should I change the port number to 8020 in the XML file on the Hadoop cluster?&lt;/P&gt;

&lt;P&gt;/home/splunkd1/hadoop-2.7.2/bin/hadoop fs -ls hdfs://ABC.fmrco.com:8020/&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 12:01:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249299#M28455</guid>
      <dc:creator>Harishma</dc:creator>
      <dc:date>2020-09-29T12:01:47Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249300#M28456</link>
      <description>&lt;P&gt;Very glad we could help. &lt;/P&gt;

&lt;P&gt;As for the default values, most of them should be fine. Change them if you have a specific purpose in mind (e.g. something is not working, or you want to do performance tuning). If you want to know what any of them do, find them here:&lt;BR /&gt;
&lt;A href="http://docs.splunk.com/Documentation/Splunk/6.5.1/Admin/Indexesconf"&gt;http://docs.splunk.com/Documentation/Splunk/6.5.1/Admin/Indexesconf&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;As for the error message, it gives you an exact command to run to help you debug:&lt;BR /&gt;
/home/splunkd1/hadoop-2.7.2/bin/hadoop fs -stat hdfs://ABC.tesco.com:8020/&lt;/P&gt;

&lt;P&gt;Try it from the command line. If it does not work, the problem is in your Hadoop configurations and/or network connectivity. If it does work, then the problem is on the Splunk side. It sounds like you configured HDFS to accept connections on port 9000? If so, then your provider configuration needs to match that:&lt;BR /&gt;
vix.fs.default.name = hdfs://ABC.tesco.com:9000&lt;/P&gt;</description>
      <pubDate>Tue, 06 Dec 2016 20:10:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249300#M28456</guid>
      <dc:creator>kschon_splunk</dc:creator>
      <dc:date>2016-12-06T20:10:31Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249301#M28457</link>
      <description>&lt;P&gt;Yup, as you rightly said, I had provided 8020 in my provider settings. I changed it to 9000 and it worked! Thanks a ton @kschon! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 07 Dec 2016 11:15:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249301#M28457</guid>
      <dc:creator>Harishma</dc:creator>
      <dc:date>2016-12-07T11:15:05Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249302#M28458</link>
      <description>&lt;P&gt;Hey @Harishma - Looks like your original question was answered &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; Don't forget to click "Accept" to close out this question and to also up-vote the answer and any comments that were helpful to you. Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2016 20:36:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249302#M28458</guid>
      <dc:creator>aaraneta_splunk</dc:creator>
      <dc:date>2016-12-08T20:36:40Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Analytics for Hadoop: What is the difference between Hadoop Cluster and Hadoop CLI?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249303#M28459</link>
      <description>&lt;P&gt;Glad it's working!&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2016 22:36:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Splunk-Analytics-for-Hadoop-What-is-the-difference-between/m-p/249303#M28459</guid>
      <dc:creator>kschon_splunk</dc:creator>
      <dc:date>2016-12-08T22:36:15Z</dc:date>
    </item>
  </channel>
</rss>

