Hortonworks HA NameNodes give error "Operation category READ is not supported in state standby"

My Hadoop cluster's active HA NameNode (host1) suddenly failed over to the standby NameNode (host2). I could not find any error in the Hadoop logs (on any server) that would identify the root cause.

After the failover, the following error appears frequently in the HDFS logs, and no application can read HDFS files.

2014-07-17 01:58:53,381 WARN namenode.FSNamesystem (FSNamesystem.java:getCorruptFiles(6769)) - Get corrupt file blocks returned error: Operation category READ is not supported in state standby

As soon as I restart the new active NameNode (host2), the active role fails back to the node that had become standby (host1). The cluster then works fine again, and users can read HDFS files.
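To confirm which NameNode currently holds the active role before and after such a restart, something like the following sketch may help. It wraps the `hdfs haadmin -getServiceState <serviceId>` command; the service IDs `nn1` and `nn2` are placeholders (the real IDs are whatever `dfs.ha.namenodes.<nameservice>` lists in hdfs-site.xml), and the helper names are made up for illustration.

```python
import subprocess
from typing import Callable, Dict, List


def _run(cmd: List[str]) -> str:
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def ha_states(service_ids: List[str],
              run: Callable[[List[str]], str] = _run) -> Dict[str, str]:
    """Ask `hdfs haadmin` for the HA state of each NameNode service ID.

    `run` is injectable so the parsing can be exercised without a cluster.
    """
    states = {}
    for sid in service_ids:
        out = run(["hdfs", "haadmin", "-getServiceState", sid])
        # The state ("active" or "standby") is taken from the last non-empty line.
        lines = [line.strip() for line in out.splitlines() if line.strip()]
        states[sid] = lines[-1] if lines else "unknown"
    return states


def active_ids(states: Dict[str, str]) -> List[str]:
    """Return the service IDs that report themselves as active."""
    return [sid for sid, state in states.items() if state == "active"]
```

On a host with the HDFS client configured, `active_ids(ha_states(["nn1", "nn2"]))` would be expected to return exactly one ID; an empty list means no NameNode is serving reads.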

I am using Hortonworks 2.1.2.0 and HDFS version 2.4.0.2.1

Edit (21 July 2014): The following entries were found in the active NameNode's logs at the moment it transitioned to standby:

NT_SETTINGS-1675610.csv dst=null perm=null
2014-07-20 09:06:44,746 INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(7755)) - allowed=true ugi=storm (auth:SIMPLE) ip=/10.0.1.50 cmd=getfileinfo src=////LEAPSET//138018-6.csv dst=null perm=null
2014-07-20 09:06:44,747 INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(7755)) - allowed=true ugi=storm (auth:SIMPLE) ip=/10.0.1.50 cmd=getfileinfo src=////LEAPSET/MERCHANT_SETTINGS/MERCHANT_SETTINGS-1695794.csv dst=null perm=null
2014-07-20 09:06:44,747 INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(7755)) - allowed=true ugi=storm (auth:SIMPLE) ip=/10.0.1.50 cmd=getfileinfo src=////LEAPSET//139954-1.csv dst=null perm=null
2014-07-20 09:06:44,748 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1095)) - Stopping services started for active state
2014-07-20 09:06:44,750 INFO namenode.FSEditLog (FSEditLog.java:endCurrentLogSegment(1153)) - Ending log segment 842249
2014-07-20 09:06:44,752 INFO namenode.FSEditLog (FSEditLog.java:printStatistics(673)) - Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 4 35
2014-07-20 09:06:44,774 INFO namenode.FSEditLog (FSEditLog.java:printStatistics(673)) - Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 24 37
2014-07-20 09:06:44,805 INFO namenode.FSNamesystem (FSNamesystem.java:run(4362)) - NameNodeEditLogRoller was interrupted, exiting
2014-07-20 09:06:44,824 INFO namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(130)) - Finalizing edits file //Hadoop/HDFS/NameNode//edits_inprogress_0000000000000842249 -> /ebs/hadoop/hdfs/namenode/current/edits_0000000000000842249-0000000000000842250
2014-07-20 09:06:44,874 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(168)) - Shutting down CacheReplicationMonitor
2014-07-20 09:06:44,876 INFO namenode.FSNamesystem (FSNamesystem.java:startStandbyServices(1136)) - Starting services required for standby state
2014-07-20 09:06:44,927 INFO ha.EditLogTailer (EditLogTailer.java:<init>(117)) - Will roll logs on active node at hadoop-client-us-west-1b/10.0.254.10:8020 every 120 seconds.
2014-07-20 09:06:44,929 INFO ha.StandbyCheckpointer (StandbyCheckpointer.java:start(129)) - Starting standby checkpoint thread...
Checkpointing active NN at http://hadoop-client-us-west-1b:50070
Serving checkpoints at http://hadoop-client-us-west-1a:50070
2014-07-20 09:06:44,930 INFO ipc.Server (Server.java:run(2027)) - IPC Server handler 3 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.0.1.50:57297 Call#8431877 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
2014-07-20 09:06:44,930 INFO ipc.Server (Server.java:run(2027)) - IPC Server handler 16 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.0.1.50:57294 Call#130105071 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
2014-07-20 09:06:44,940 INFO ipc.Server (Server.java:run(2027)) - IPC Server handler 14 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.0.1.50:57294 Call#130105072 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
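When hunting for the moment of failover in a large NameNode log, it can help to filter entries by the method recorded in each line rather than by eye. The sketch below is one possible approach, not an official tool. It assumes the log has one entry per line in the layout `<timestamp> <LEVEL> <logger> (<File>.java:<method>(<line>)) - <message>`; the function name `summarize` and the sample lines (abbreviated from the excerpt above) are illustrative only.

```python
import re
from collections import Counter

# Assumed entry layout:
#   <date> <time>,<ms> <LEVEL> <logger> (<File>.java:<method>(<line>)) - <message>
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s*"
    r"\((?P<file>\w+\.java):(?P<method>[\w<>$]+)\((?P<line>\d+)\)\)\s+-\s*(?P<msg>.*)$"
)

# FSNamesystem methods that appear in the excerpt around the role change.
TRANSITION_METHODS = {
    "stopActiveServices": "left active state",
    "startStandbyServices": "entered standby state",
}


def summarize(lines):
    """Return (transitions, rejected) for an iterable of log lines.

    transitions: list of (timestamp, description) for active/standby changes.
    rejected:    Counter of ClientProtocol calls refused with StandbyException.
    """
    transitions = []
    rejected = Counter()
    for raw in lines:
        m = ENTRY.match(raw.strip())
        if not m:
            continue  # continuation lines and truncated entries are skipped
        if m["method"] in TRANSITION_METHODS:
            transitions.append((m["ts"], TRANSITION_METHODS[m["method"]]))
        if "StandbyException" in m["msg"]:
            call = re.search(r"ClientProtocol\.(\w+)", m["msg"])
            rejected[call.group(1) if call else "unknown"] += 1
    return transitions, rejected


SAMPLE = [
    "2014-07-20 09:06:44,748 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1095)) - Stopping services started for active state",
    "2014-07-20 09:06:44,876 INFO namenode.FSNamesystem (FSNamesystem.java:startStandbyServices(1136)) - Starting services required for standby state",
    "2014-07-20 09:06:44,930 INFO ipc.Server (Server.java:run(2027)) - IPC Server handler 3 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.0.1.50:57297 Call#8431877 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby",
    "2014-07-20 09:06:44,930 INFO ipc.Server (Server.java:run(2027)) - IPC Server handler 16 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.0.1.50:57294 Call#130105071 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby",
]

transitions, rejected = summarize(SAMPLE)
for ts, what in transitions:
    print(ts, what)
print(dict(rejected))
```

Running the same function over the ZKFC (failover controller) log from the same time window may show why the role change was triggered, since the NameNode log itself only records that it happened.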

: 13 2014 namenode, namenode , .

READ .

: 7 2014 , , solution namenode, namenode. namenodes HA node.


. . Ambari . SPARK_HOME.


Source: https://habr.com/ru/post/1548717/

