
Anyone ever tried using Hunk with Datastax Enterprise?

dcparker
Path Finder

Hello,

I was playing around and trying to set up my DataStax Enterprise Analytics nodes with Hunk.
Info about DSE: DSE Hadoop

I got as far as creating the index, but unfortunately DataStax uses CFS rather than HDFS. I tried setting up the provider as HDFS anyway and that didn't work, and when I try CFS I get:

 [test hadoop] RuntimeException - Failed to create a virtual index filesystem connection: No FileSystem for scheme: cfs. Advice: Verify that your vix.fs.default.name is correct and available. 

Using HDFS:

 Failed to create a virtual index filesystem connection: Call to hostname/192.168.31.1:9160 failed on local exception: java.io.EOFException.

Not sure if this is really possible but was curious if anyone else had tried this. Thanks for any help/advice!


Ledion_Bitincka
Splunk Employee

Thanks for clarifying what's happening. Can you try:

(1) add the following to the provider stanza

vix.fs.cfs.impl = com.datastax.bdp.hadoop.cfs.CassandraFileSystem

(2) use vix.fs.default.name = cfs://cassandrahost/
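Combining these two changes with the provider settings shown later in this thread, the stanza in indexes.conf might look like the sketch below - a sketch only, assuming cassandrahost is replaced with the real Cassandra/CFS host (the HADOOP_HOME and JAVA_HOME values are taken from the poster's existing config):

```
[provider:test hadoop]
vix.family = hadoop
vix.env.HADOOP_HOME = /usr/share/dse/hadoop/
vix.env.JAVA_HOME = /usr
# Register the cfs:// scheme so Hadoop can resolve it to a FileSystem class
vix.fs.cfs.impl = com.datastax.bdp.hadoop.cfs.CassandraFileSystem
# Point the default filesystem at the host serving CFS (placeholder hostname)
vix.fs.default.name = cfs://cassandrahost/
```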

Ledion_Bitincka
Splunk Employee

Hmm, unfortunately we don't log the entire stacktrace so we'd have to guess at this point - maybe http://stackoverflow.com/questions/19534811/cassandra-startup-java-lang-reflect-invocationtargetexce...?

I would recommend that you try to get the Hadoop CLI tools to work with the cfs:// filesystem directly (i.e. plain hadoop fs ..., not dse hadoop fs ...), and then we can apply those conf changes to Hunk, which is filesystem agnostic - for example, we work with Amazon's S3 (s3/s3n) out of the box.


dcparker
Path Finder

Here's search.log
http://pastebin.com/5ucVrKrb


Ledion_Bitincka
Splunk Employee

Can you provide search.log?


dcparker
Path Finder

Actually, I found it: dse.jar.

I added it to the classpath and now see:

[test hadoop] RuntimeException - Failed to create a virtual index filesystem connection: java.lang.reflect.InvocationTargetException. Advice: Verify that your vix.fs.default.name is correct and available.


dcparker
Path Finder

Thanks, I briefly checked but couldn't find any jar with that class in it. I will look more closely tomorrow. I might just be able to put the Cassandra jars in there, but I doubt that would work since grep didn't find anything.

I did find this too: http://www.datastax.com/support-forums/topic/how-can-we-enable-hdfs-and-cfs-too

It looks like I can make HDFS the default, but that sort of defeats the purpose of having CFS.

Thanks for your help! I will let you know if I find the jar.


Ledion_Bitincka
Splunk Employee

Correct - indexes.conf. Now you're running into a classpath issue. Can you try to find the Cassandra jar where the class com.datastax.bdp.hadoop.cfs.CassandraFileSystem is defined, and then add that jar to the following field in the provider: vix.env.HADOOP_CLASSPATH

Command to list the contents of the jar: unzip -l [jar-file] | grep CassandraFileSystem


dcparker
Path Finder

Ok. Assuming I did this right, here's what it shows now:

[test hadoop] RuntimeException - Failed to create a virtual index filesystem connection: java.lang.ClassNotFoundException: com.datastax.bdp.hadoop.cfs.CassandraFileSystem. Advice: Verify that your vix.fs.default.name is correct and available.

I added
vix.fs.cfs.impl = com.datastax.bdp.hadoop.cfs.CassandraFileSystem

to indexes.conf. Is that correct?


Ledion_Bitincka
Splunk Employee

Can you access CFS from the Hadoop CLI? e.g.:

hadoop fs -ls cfs://....

dcparker
Path Finder

search.log with cfs:
http://pastebin.com/3y2pFd8s

search.log with hdfs:
http://pastebin.com/STUEHxnV


dcparker
Path Finder

I tried to use CFS in the indexes config but got: "[test hadoop] RuntimeException - Failed to create a virtual index filesystem connection: No FileSystem for scheme: cfs. Advice: Verify that your vix.fs.default.name is correct and available." That's when I changed to HDFS to see if it worked.

Trying that command returns: failed on local exception: java.io.EOFException


Ledion_Bitincka
Splunk Employee

Hmmm, so you're not using cfs:// - I wonder where the cfs is coming from. Can you also share the contents of search.log? Also, does "hadoop fs -ls hdfs://hostname:9160/" return anything?


dcparker
Path Finder

Here is indexes.conf:
[provider:test hadoop]
vix.env.HADOOP_HOME = /usr/share/dse/hadoop/
vix.env.JAVA_HOME = /usr
vix.family = hadoop
vix.fs.default.name = hdfs://hostname:9160
vix.mapred.job.tracker = hostname:8012
vix.splunk.home.hdfs = /data1/hunk/

[testing]
vix.input.1.path = /...
vix.provider = test hadoop

Here is core-site:
http://pastebin.com/LL69Hn4a


Ledion_Bitincka
Splunk Employee

Can you share the contents of indexes.conf and core-site.xml? This is most likely a config issue with cfs not being a registered filesystem in the Hadoop conf.
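For reference, registering the scheme directly in core-site.xml (rather than via the provider stanza) would look something like the fragment below; this is a hypothetical entry following the standard Hadoop fs.&lt;scheme&gt;.impl convention, not a verified DSE config:

```
<!-- Hypothetical core-site.xml entry: maps the cfs:// scheme to
     DataStax's CFS implementation so the Hadoop CLI can resolve it -->
<property>
  <name>fs.cfs.impl</name>
  <value>com.datastax.bdp.hadoop.cfs.CassandraFileSystem</value>
</property>
```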


dcparker
Path Finder

Oh - sorry, I misunderstood.

If I go to the dir where the hadoop binary is and try to run it I get:
bash-4.1$ ./hadoop fs -ls cfs://
ls: No FileSystem for scheme: cfs

However, I can do:
./hadoop fs -ls /data1
drwx------ - root root 16384 2014-02-20 17:04 /data1/lost+found
drwxr-xr-x - cassandra cassandra 4096 2014-03-05 18:30 /data1/hunk
drwxrwxr-x - cassandra cassandra 4096 2014-03-05 18:01 /data1/cassandra


Ledion_Bitincka
Splunk Employee

Great! What about calling the CLI directly as in: "hadoop fs -ls cfs://" ?


dcparker
Path Finder

Yes, here's a sample:

bash-4.1$ dse hadoop fs -ls cfs://
Found 2 items
drwxrwxrwx - cassandra cassandra 0 2014-03-05 18:09 /data1
drwxrwxrwx - cassandra cassandra 0 2014-03-05 18:01 /tmp
