Splunk Enterprise

How to monitor my hadoop cluster CDP 7.1.3 with Splunk?

cuian01
Observer

Dear All,

I'm very new to Splunk!

In my organization, Splunk Enterprise was deployed and the management want to monitor all the data platforms, applications in Splunk.

Lately, I have deployed Cloudera CDP 7.1.3 in our data center.  Management is expecting Splunk to analyze Hadoop Log files. How to use Splunk to proactively monitor the user activities, service logs and server logs in CDP 7.1.3? Is there any additional component required?

 

Appreciate if you can share your knowledge on it!

 

Thanks

Labels (1)
0 Karma

inventsekar
Super Champion

Hi @cuian01 check these 2 apps please:

https://splunkbase.splunk.com/app/3134/

The Hadoop Monitoring Add-on allows a Splunk software administrator to collect Yarn and Hadoop log files as well as Hadoop nodes OS matrix. The App was tested with Hortonworks, Cloudera, and MapR distributions. After the Splunk platform indexes the events, you can analyze the data by building searches and dashboards. The add-on includes few sample prebuilt dashboard panels and reports.

https://splunkbase.splunk.com/app/1180/

Splunk Hadoop Connect provides bi-directional integration to easily and reliably move data between Splunk and Hadoop.

0 Karma

cuian01
Observer

@inventsekar ,

Thanks for your swift reply!

Actually, I checked the "Hadoop Monitor" app before. But the sample links are all to Hortonworks. With Cloudera & Hortonworks merged together, does "Hadoop Monitor" support latest CDP 7.1.3 release?

0 Karma

rdagan_splunk
Splunk Employee
Splunk Employee

The Cloudera specific log location should be here:

### Cloudera Yarn Log Files

[monitor:///var/log/hadoop-yarn/*nodemanager*]
sourcetype = hadoop_nodemanager
index = hadoopmon_metrics

[monitor:///var/log/hadoop-yarn/*resourcemanager*]
sourcetype = hadoop_resourcemanager
index = hadoopmon_metrics

[monitor:///var/log/hadoop-yarn/*proxyserver*]
sourcetype = hadoop_proxyserver
index = hadoopmon_metrics

[monitor:///var/log/hadoop-mapreduce/*historyserver*]
sourcetype = hadoop_historyserver
index = hadoopmon_metric

### Cloudera Hadoop Log Files

[monitor:///var/log/hadoop-hdfs/*datanode*]
sourcetype = hadoop_datanode
index = hadoopmon_metrics

[monitor:///var/log/hadoop-hdfs/*namenode*]
sourcetype = hadoop_namenode
index = hadoopmon_metrics

[monitor:///var/log/hadoop-hdfs/*secondarynamenode*]
sourcetype = hadoop_secndarynamenode
index = hadoopmon_metrics

[monitor:///var/log/hadoop-hdfs/*journalnode*]
sourcetype = hadoop_journalnode
index = hadoopmon_metrics


### Cloudera Configuration Files

[monitor:///etc/hadoop/conf/*]
crcSalt = <SOURCE>
disabled = 0
sourcetype = hadoop_global_conf
index = hadoopmon_configs

And after you collect the logs you can run searches similar to these:

[Yarn All Applications]
index=hadoopmon_metrics sourcetype=hadoop_resourcemanager appId=* | eval elapsed_time = finishTime - startTime | table appId name user queue finalStatus elapsed_time

[Yarn Top User]
index=hadoopmon_metrics sourcetype=hadoop_resourcemanager appId=* | top user

[Yarn Success Rate]
index=hadoopmon_metrics sourcetype=hadoop_resourcemanager appId=* | top finalStatus

0 Karma
.conf21 CFS Extended through 5/20!

Don't miss your chance
to share your Splunk
wisdom in-person or
virtually at .conf21!

Call for Speakers has
been extended through
Thursday, 5/20!