Wednesday, June 18, 2014

Issues & Solutions: HDFS High Availability for Hadoop 2.X

Here I discuss a few errors I ran into during Hadoop HA setup, along with their solutions.

Error 1)
This occurred when I started the ResourceManager from the active NameNode in a Hadoop HA setup:
root@master:/opt/hadoop-2.2.0# sbin/yarn-daemon.sh start resourcemanager

Problem binding to [master:8025] java.net.BindException: Cannot assign requested address;
Solution
Check your /etc/hosts file. If there are multiple entries for the same IP address or for localhost, delete the extras so that only one valid entry remains.
I simply removed all other entries for the IP 10.184.39.167 from /etc/hosts, leaving only:

10.184.39.167 standby
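
To spot this kind of problem quickly, here is a small sketch that lists every IP address appearing on more than one line of a hosts file. The function name duplicate_host_ips is my own and not part of Hadoop, and it only looks at the first column of each line, so treat it as a rough check rather than a full validation.

```shell
# List IP addresses that appear on more than one line of a hosts file.
# Usage: duplicate_host_ips [file]   (defaults to /etc/hosts)
# Hypothetical helper for illustration; not a Hadoop tool.
duplicate_host_ips() {
    # Skip blank lines and comment lines, then count the first column.
    awk '!/^[[:space:]]*(#|$)/ { count[$1]++ }
         END { for (ip in count) if (count[ip] > 1) print ip }' \
        "${1:-/etc/hosts}" | sort
}
```

Running duplicate_host_ips /etc/hosts on each node, and then getent hosts master, shows which address the hostname from the error message actually resolves to.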


Error 2)
After configuring Hadoop HA, I started bringing up the Hadoop cluster and formatted the NameNode:
root@master[bin]#hdfs namenode -format

FATAL namenode.NameNode:Exception in namenode join

org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:

10.184.39.67:8485: Call From standby/10.184.39.62 to master:8485 failed on connection exception: java.net.ConnectException:
Solution
Start all the configured JournalNodes first, then format the NameNode. Run the start command on every JournalNode host:
root@master[bin]#../sbin/hadoop-daemon.sh start journalnode
root@master[bin]#hdfs namenode -format
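
Before formatting, it can help to confirm that every JournalNode is really listening. The sketch below is an assumption-laden helper, not a Hadoop command: the function names port_open and journalnodes_down are hypothetical, it relies on bash and the coreutils timeout command being installed, and it assumes the JournalNodes use port 8485 as shown in the error above.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Print each host (passed as arguments) whose JournalNode port 8485
# is not accepting connections yet. Prints nothing if all are up.
journalnodes_down() {
    for host in "$@"; do
        port_open "$host" 8485 || echo "$host"
    done
}
```

For example, journalnodes_down master standby should print nothing once both JournalNodes are up. Pass whichever hosts are listed in your dfs.namenode.shared.edits.dir setting.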


Error 3)
With the JournalNodes started as in the solution to Error 2, the NameNode formatted successfully.
The next error appeared when I tried to start DFS:
root@master[bin]#../sbin/start-dfs.sh

java.io.IOException: Cannot start an HA namenode with name dirs that need recovery. Dir: Storage Directory /app/hadoop2/namenode state: NOT_FORMATTED
Solution
This error is caused by the NameNode metadata not being in sync between the active node and the standby node. On a fresh distributed Hadoop setup you may not see it.
I copied the NameNode directory from the active NameNode to the standby NameNode:
root@standby[hadoop-2.2.0]#scp -r /app/hadoop2/namenode/* root@master:/app/hadoop2/namenode/
Make sure the Hadoop directory has full permissions:
root@master[bin]#chmod 777 -R /app/hadoop2/
Then I restarted DFS:
root@master[bin]#../sbin/start-dfs.sh
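
A quick way to see whether a NameNode directory is in the NOT_FORMATTED state is to look for the current/VERSION file that a formatted name directory contains. The sketch below assumes the /app/hadoop2/namenode path from the error message; the function name namenode_dir_formatted is hypothetical. Hadoop 2.x also ships hdfs namenode -bootstrapStandby, which copies the metadata from the running NameNode and can be used instead of scp; it is shown only as a comment because I did not use it in this setup.

```shell
# Return 0 if the given NameNode storage directory looks formatted,
# i.e. it contains current/VERSION. Hypothetical helper, not a Hadoop tool.
namenode_dir_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Run on the NameNode that reports NOT_FORMATTED.
if ! namenode_dir_formatted /app/hadoop2/namenode; then
    echo "name dir is not formatted; copy the metadata from the other NameNode first"
    # Stock alternative to scp (run on the unformatted NameNode while
    # the other NameNode is running):
    # hdfs namenode -bootstrapStandby
fi
```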


Error 4)
The NameNode doesn't start in Hadoop 2.x:
root@master[bin]#../sbin/start-dfs.sh

Incorrect configuration namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured. Starting namenodes on [] …. …..
Solution
This error points to a problem with a property in one of the *-site.xml files. I checked my *-site.xml files and everything seemed correct, but I had mistakenly commented out fs.default.name in core-site.xml. Restoring the property fixed it:
<property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
</property>
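
Because a commented-out property is easy to miss by eye, here is a rough sketch that strips XML comments from a config file and then checks whether the default filesystem property is still present. The function name has_default_fs is hypothetical, and the comment stripping is line-based and approximate (it drops whole lines inside multi-line comments), so it is a sanity check and not an XML parser. It accepts both fs.default.name and fs.defaultFS, since the former is the deprecated name of the latter in Hadoop 2.x.

```shell
# Return 0 if the file defines fs.default.name or fs.defaultFS outside
# of XML comments. Hypothetical helper; comment handling is approximate.
has_default_fs() {
    # First drop comments that open and close on one line, then drop
    # every line from an opening <!-- through the closing -->.
    sed -e 's/<!--.*-->//g' -e '/<!--/,/-->/d' "$1" |
        grep -Eq '<name>[[:space:]]*fs\.(default\.name|defaultFS)[[:space:]]*</name>'
}
```

For example, has_default_fs /opt/hadoop-2.2.0/etc/hadoop/core-site.xml || echo "default filesystem is not set", where the path assumes the install directory shown in Error 1 and the standard etc/hadoop layout. The value Hadoop actually picks up can also be printed with hdfs getconf -confKey fs.defaultFS.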



