Try running this command to find which jars contain the above class, and then check whether that jar is in the Hadoop classpath:
find . -name "*.jar" -exec grep -Hsli com.splunk.mr.SplunkMR {} \;
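Once you know which jar contains the class, you can check whether it is already on the Hadoop classpath (a quick sketch; grepping for 'splunk' assumes the jar name contains that string):
hadoop classpath | tr ':' '\n' | grep -i splunk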
We will need to dig into the Hadoop logs - specifically the application attempt log - to see the actual error.
Exception - java.io.IOException: Error while waiting for MapReduce job to complete, job_id=job_1525914386605_0005, state=FAILED, reason=Application application_1525914386605_0005 failed 2 times due to AM Container for appattempt_1525914386605_0005_000002 exited with exitCode: -1000
Normally http://<YARN Resource Manager IP>:8088 should take you to the main Hadoop YARN page.
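If you prefer the command line, the YARN CLI can usually pull the same attempt logs (a sketch, using the application id from the error above):
yarn logs -applicationId application_1525914386605_0005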
What version of Splunk are you using?
When you go to http://localhost:8088/conf you should be able to see all the correct values for the YARN Resource Manager:
yarn.resourcemanager.address and yarn.resourcemanager.scheduler.address
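On the Splunk Search Head itself, you can also check what the Hadoop client configuration resolves those keys to (a sketch; it assumes the Hadoop client is installed and on the PATH):
hdfs getconf -confKey yarn.resourcemanager.address
hdfs getconf -confKey yarn.resourcemanager.scheduler.address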
Can you try to run index=test1 | stats count
Yes, Hunk is the older name for Splunk Analytics for Hadoop. They are both licensed the same way.
Splunk Analytics for Hadoop is already part of normal Splunk, so you do not need to install any additional Splunk software (you do need Hadoop and Java on the Search Head)
Splunk Hadoop Connect will copy the files from HDFS to the Splunk indexers. Splunk Analytics for Hadoop, on the other hand, will not index the data in Splunk; it runs MapReduce jobs on the Hadoop cluster and returns only the results.
You are correct. A client machine is needed to load the file, and the Hadoop libraries must be installed on that client node.
The client node identifies the Hadoop cluster using the Name Node IP and port. These days the TaskTracker is no longer used, so you will also need the YARN Resource Manager IP and port.
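As a quick sanity check from the client node, you can list a directory through the Name Node directly (a sketch; namenode-host and port 8020 are placeholders for your own values):
hadoop fs -ls hdfs://namenode-host:8020/user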
It looks as if you are using the TaskTracker and not YARN.
Change your settings to YARN and point to the YARN Resource Manager instead of the TaskTracker.
You are using Hadoop 2.7, which defaults to YARN.
In addition, your path to the data in HDFS looks wrong. Normally all you need is /user/username
To test whether you have the right location of the file in HDFS, I recommend you try this command from the CLI:
hadoop fs -ls maprfs:///user/mapr
Yes, Hadoop Data Roll will work with the version you are using. The docs list Cloudera 5.*
As you can see from the Splunk Analytics for Hadoop system requirements page (Hadoop Data Roll is an extension of Splunk Analytics for Hadoop), Cloudera 5.* is supported: http://docs.splunk.com/Documentation/Splunk/latest/HadoopAnalytics/Systemrequirements
This document can help debug these issues: http://docs.splunk.com/Documentation/Splunk/latest/HadoopAnalytics/TroubleshootSplunkAnalyticsforHadoop
Are you running in Verbose mode?
Are you able to access the Hadoop logs to examine the performance of Hadoop itself?
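If you can reach the Search Head file system, the search.log of the failing search is also worth a look (a sketch, assuming a default /opt/splunk install; <sid> is a placeholder for the id of the search):
grep -iE 'erp|error' /opt/splunk/var/run/splunk/dispatch/<sid>/search.log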
In that case, you may want to try one of these options:
1) Modify the Java code of the App itself to include the option to add a path to the certificate
2) Experiment with Splunk DB Connect and see if the MongoDB JDBC driver includes the option to authenticate using X509
3) Remove the requirement for X509 and replace it with one of the other authentication options
Have you tried first adding the certificate to MongoDB using the mongo shell, and only then connecting using the Splunk App?
See the steps at the bottom of this page:
https://docs.mongodb.com/manual/tutorial/configure-x509-client-authentication/
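For reference, a connection test from the mongo shell along those lines might look like this (a sketch only; the file paths, host, and port are placeholders, and newer MongoDB versions use the --tls flags instead of --ssl):
mongo --ssl --sslPEMKeyFile /path/to/client.pem --sslCAFile /path/to/ca.pem --host mongodb.example.com --port 27017 --authenticationDatabase '$external' --authenticationMechanism MONGODB-X509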
Splunk Analytics for Hadoop can analyze many other data types. For example, Sequence, Avro, Parquet, ORC, RC, and HAR files, as well as all the text file options (JSON, logs, CSV, TSV, ...), are supported.
You are correct: when you roll data to Hadoop, sources is the only available | metadata type.
For example, | metadata type=sources index=splunkaccesscombine_archive
Looking at HDFS, I see 4 files and one of them is bucket-metadata.seq, which contains the host and sourcetype. However, I suspect that | metadata does not look inside that file.
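If you want to confirm what is in that file yourself, hadoop fs -text can usually decode a sequence file (a sketch; the bucket path is a placeholder for your archive location):
hadoop fs -text /path/to/archive/index/bucket/bucket-metadata.seq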
Although Splunk does not offer an option to copy a virtual index, you can create a new virtual index and point it at the same HDFS path.
Yes, what you are trying to do will work.
If you have many virtual indexes that need to be renamed, you may want to (see the command-line sketch after these steps):
1) Find the indexes.conf file that contains all of your virtual index configurations (the default is /opt/splunk/etc/apps/search/local/indexes.conf )
2) Make a copy of that file (just in case)
3) Modify the names of the virtual indexes in that indexes.conf file
4) Restart Splunk
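A minimal command-line sketch of those steps, assuming the default path above and a /opt/splunk install (in the editor, rename the [stanza] headers of the virtual indexes):
cp /opt/splunk/etc/apps/search/local/indexes.conf /opt/splunk/etc/apps/search/local/indexes.conf.bak
vi /opt/splunk/etc/apps/search/local/indexes.conf
/opt/splunk/bin/splunk restart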
A few ideas:
1) Use NiFi, but with InvokeHTTP instead of PutSplunk. Here is how to do it: https://www.youtube.com/watch?v=Dq9qKU9HZYM&t=25s
2) Use Hadoop Connect, but mount HDFS as a file system instead of going through the Name Node. Here is how to do it: http://docs.splunk.com/Documentation/HadoopConnect/1.2.5/DeployHadoopConnect/Configuretheapp#Map_to_a_mounted_file_system
I just tried it without any issues.
I tried both index=xyz OR index=abc somekeyword as well as (index=abc somekey=somevalue) OR (index="xyz" somekey=somevalue)
Also, you can find links to Docker, VMware, and VirtualBox images and tutorials at this link:
https://github.com/rdagan/Splunk-Data-Fabric-Integration-Sandbox
Try these links for the Hunk 7.* sandbox:
VBox Sandbox = https://splunk.box.com/s/fqe6285ppa8kcnbkvyvf016qped4d693
VBox HTML Tutorial = https://splunk.box.com/s/t0dy1297i7oc89n07bo2rdcpdtb6h934
With Hadoop Connect, all Kerberos flags must be in the clusters.conf and core-site.xml files.
When you create a new connection from the UI, Splunk generates these two files.
http://docs.splunk.com/Documentation/HadoopConnect/1.2.5/DeployHadoopConnect/Configurationfilereference
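To confirm what Splunk generated, you can look at the files directly (a sketch; the path assumes a default /opt/splunk install and that the app directory is named HadoopConnect):
cat /opt/splunk/etc/apps/HadoopConnect/local/clusters.conf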
This link may help you debug the message 'Failed to find any Kerberos tgt':
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_sg_verify_kerb_security_s18.html
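In many cases that message simply means there is no valid ticket in the cache on the Splunk Search Head. A quick check and renewal (a sketch; the keytab path and principal are placeholders):
klist
kinit -kt /path/to/splunk.keytab your_principal@YOUR.REALM
klist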
Yes, normally people install the Hadoop client binaries and Hadoop client configs on the Splunk Search Head.
In many cases you can use Cloudera Manager, or you can just install Hadoop using these simple steps: http://hadoop.apache.org/docs/r2.7.4/
In your configuration, I see that HADOOP_HOME seems wrong: vix.env.HADOOP_HOME = /user/splunkdev looks like it points to Splunk binaries, not Hadoop binaries.
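For example, if the Hadoop client binaries were installed under /opt/hadoop (an assumed path), the setting would be vix.env.HADOOP_HOME = /opt/hadoop, and this command should run cleanly from the Search Head:
/opt/hadoop/bin/hadoop version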
I would recommend that, for at least the Splunk Search Head, you get your Hadoop team to set up a full Hadoop client environment. That will eliminate many configuration issues.
Once that is done, you will know the right configurations to apply to all the indexers.