
Why do I receive a "BlockMissingException - Could not obtain block" error when trying to search a Hadoop virtual index?

sbrice
Explorer

I have created my Hadoop provider and configured my virtual index. However, when I search the virtual index, I receive the following error in the Splunk search window.

"[hadoop_hie_hdfs] BlockMissingException - Could not obtain block: BP-1447578430-10.9.104.12-1453857005466:blk_1075773485_2044567 file=/opsanalytics/snow/Dim_Configuration_Item_CM_Approver/000001_0"

Splunk 6.5
Hadoop CLI 2.2.0

rdagan_splunk
Splunk Employee

Based on this link, it looks like your NameNode cannot find the blocks:
https://thebipalace.com/2016/05/16/hadoop-error-org-apache-hadoop-hdfs-blockmissingexception-could-n...
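
To confirm this on the Hadoop side, you can ask the NameNode which DataNodes (if any) still report the block, using the file path from your error. A minimal sketch:

hdfs fsck /opsanalytics/snow/Dim_Configuration_Item_CM_Approver/000001_0 -files -blocks -locations

If fsck reports the block as MISSING or lists no locations, the problem is in HDFS itself rather than in Splunk.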


sarnagar
Contributor

Hi @rdagan ,

I'm facing the same error. Moreover, I'm using a single-node cluster in DEV, and my block does exist:

2016-12-19 02:30:33,798 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-1826813176-10.109.137.83-1480588903531 Total blocks: 21, missing metadata files:0, missing block files:0, missing blocks in memory:0, mismatched blocks:0

But I still get the error below while running a query:

BlockMissingException - Could not obtain block: BP-1826813176-10.109.137.83-1480588903531:blk_1073741844_1020 file=/data/input/splunk/linux2/README.txt

I also restarted the nodes, but I still face the error.

@sbrice - Were you able to fix the error?


rdagan_splunk
Splunk Employee

I assume the data is on your DataNode, but for some reason your NameNode cannot locate it.
My recommendation is to try running these from the command line (not through the Splunk UI):
1) hadoop fs -text /data/input/splunk/linux2/README.txt
2) hadoop fs -text hdfs://<namenode-host>:8020/data/input/splunk/linux2/README.txt
3) run a MapReduce job on this file from the command line, as the Splunk user (a sketch follows this list)
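
For step 3, a minimal sketch using the stock examples jar (the jar path varies by Hadoop distribution, so treat it as an assumption):

# run as the same OS user the Splunk provider uses; adjust the jar path to your distribution
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
  wordcount /data/input/splunk/linux2/README.txt /tmp/wordcount-out
hadoop fs -cat /tmp/wordcount-out/part-r-00000

If the job fails with the same BlockMissingException, Splunk is not the cause.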


sarnagar
Contributor

Hi @rdagan ,

Thank you so much for guiding me.
Actually, none of the commands worked from the search head (SH) command line.

But when I tried them on the Hadoop cluster itself, I got the warnings below. So I deleted the corrupt blocks and re-added the data to the DataNodes (a sketch of the cleanup commands follows the log).

But do you know why blocks get corrupted in Hadoop? I know this is outside Splunk's scope; I'm just asking out of curiosity.

16/12/20 02:21:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/20 02:21:17 INFO hdfs.DFSClient: No node available for BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 file=/data/input/splunk/linux2/LICENSE.txt
16/12/20 02:21:17 INFO hdfs.DFSClient: Could not obtain BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 from any node: java.io.IOException: No live nodes contain block BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
16/12/20 02:21:17 WARN hdfs.DFSClient: DFS chooseDataNode: got # 1 IOException, will wait for 127.55817333292474 msec.
16/12/20 02:21:17 INFO hdfs.DFSClient: No node available for BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 file=/data/input/splunk/linux2/LICENSE.txt
16/12/20 02:21:17 INFO hdfs.DFSClient: Could not obtain BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 from any node: java.io.IOException: No live nodes contain block BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
16/12/20 02:21:17 WARN hdfs.DFSClient: DFS chooseDataNode: got # 2 IOException, will wait for 7945.315192663103 msec.
16/12/20 02:21:25 INFO hdfs.DFSClient: No node available for BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 file=/data/input/splunk/linux2/LICENSE.txt
16/12/20 02:21:25 INFO hdfs.DFSClient: Could not obtain BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 from any node: java.io.IOException: No live nodes contain block BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
16/12/20 02:21:25 WARN hdfs.DFSClient: DFS chooseDataNode: got # 3 IOException, will wait for 6954.636509794202 msec.
16/12/20 02:21:32 WARN hdfs.DFSClient: Could not obtain block: BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 file=/data/input/splunk/linux2/LICENSE.txt No live nodes contain current block Block locations: Dead nodes: . Throwing a BlockMissingException
16/12/20 02:21:32 WARN hdfs.DFSClient: Could not obtain block: BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 file=/data/input/splunk/linux2/LICENSE.txt No live nodes contain current block Block locations: Dead nodes: . Throwing a BlockMissingException
16/12/20 02:21:32 WARN hdfs.DFSClient: DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1826813176-10.109.137.83-1480588903531:blk_1073741842_1018 file=/data/input/splunk/linux2/LICENSE.txt
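
For anyone else who hits this: corrupt or missing blocks can be listed and cleared roughly like this (a sketch, not necessarily the exact commands; the -delete step is destructive, so re-load the data afterwards):

hdfs fsck / -list-corruptfileblocks   # list files with missing or corrupt blocks
hdfs fsck / -delete                   # delete the affected files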


rdagan_splunk
Splunk Employee

Since none of these Hadoop commands work, I would recommend you add your Hadoop nodes using a tool like Ambari or Cloudera Manager, and then add the data and HDFS directories using that same management tool. Ambari and Cloudera Manager are very good at eliminating many of the issues that come up when creating a Hadoop cluster.
Here is a link that describes some of the reasons for data corruption in HDFS: http://hadoopinrealworld.com/dealing-with-data-corruption-in-hdfs/
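
Before re-adding the data, it is also worth confirming that all DataNodes are live and registered with the NameNode, for example:

hdfs dfsadmin -report   # shows live/dead DataNodes, capacity, and missing-block counts

A block can only be served while at least one live DataNode holds a replica, so on a single-node DEV cluster (replication factor 1) one unhealthy DataNode is enough to trigger BlockMissingException.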

sarnagar
Contributor

Hi @rdagan ,

That was really informative. Thank you so much for your help! 🙂
