2014-01-22 80 views
0

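Adding a new NameNode data directory to an existing cluster

What procedure do I need to follow to properly add a new NameNode data directory (dfs.name.dir, dfs.namenode.name.dir) to an existing production cluster? I have added the new path to the comma-separated list in hdfs-site.xml, but when I try to start the namenode I get the following error: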

Directory /data/nfs/dfs/nn is in an inconsistent state: storage directory does not exist or is not accessible.

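In my case, I have two directories already in place and working (/data/1/dfs/nn, /data/2/dfs/nn). When I add the new directory, I can't start the namenode. When the new path is removed, it starts just fine. The fstab entry for my new directory looks like this: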

backup-server:/hadoop_nn /data/nfs/dfs nfs tcp,soft,intr,timeo=10,retrans=10 1 2

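Inside the mount point above I created a folder named nn. That folder has the same ownership and permissions as the nn folders in the two existing locations.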

drwx------ 2 hdfs hadoop 64 Jan 22 16:30 nn

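Do I need to manually copy all the files from one of the existing namenode directories, or should the namenode service do that automatically once it starts?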

+1

Does the ownership of '/data/nfs/dfs' (your mount point) allow the 'hdfs' user to enter the directory? – phs

+0

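The mount point is owned by root (user and group), and the permissions on that directory are 700. This is the same structure the other data directories follow (/data/1/dfs is owned by root with permissions 700, while /data/1/dfs/nn is owned by hdfs/hadoop). –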

+0

Fair enough. One more silly question: do the 'hdfs' user and the 'hadoop' group have the same uid (gid) on the NFS server and on the client? – phs

Answers

3

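I believe I may have just answered my own question. I ended up copying the entire contents of one of the existing namenode directories to the new NFS namenode directory, and I was then able to start the namenode. (Note that I stopped the namenode before copying, to avoid problems.)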

cp -rp /data/1/dfs/nn /data/nfs/dfs/nn 

I guess my assumption that the namenode would automatically copy the existing metadata into the new directory was incorrect.
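The copy described above can be sketched as a small shell helper. This is only a sketch under assumptions: `seed_nn_dir` is a hypothetical name, not a Hadoop tool, and it assumes the namenode has already been stopped by whatever means your installation uses. It copies the contents of the source directory (`src/.`) rather than the directory itself, because `cp -rp SRC DST` nests SRC inside DST when DST already exists.

```shell
#!/bin/sh
# seed_nn_dir SRC DST
# Hypothetical helper: copy the contents of an existing NameNode metadata
# directory into a new, empty one. Stop the NameNode before running it.
seed_nn_dir() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "source $src is not a directory" >&2
        return 1
    fi
    if [ ! -d "$dst" ]; then
        echo "destination $dst is not a directory" >&2
        return 1
    fi
    # Refuse to overwrite anything that is already in the destination.
    if [ -n "$(ls -A "$dst")" ]; then
        echo "destination $dst is not empty" >&2
        return 1
    fi
    # "src/." copies what is inside src instead of creating dst/<name of src>.
    # -p preserves mode and timestamps, and ownership when run as root.
    cp -rp "$src"/. "$dst"/
}

# Example with the paths from the question:
#   seed_nn_dir /data/1/dfs/nn /data/nfs/dfs/nn
```

After the copy, the new directory should contain the same `current` subdirectory as the source, and the namenode can be started with all three paths configured.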

0

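In Cloudera CDH 4.5.0, that error only occurs when the following function (from Storage.java, around line 418) returns NON_EXISTENT. In each case a warning is logged with more detail, so look for log lines from org.apache.hadoop.hdfs.server.common.Storage.

In short, the namenode thinks the directory does not exist, is not a directory, is not writable, or the access check threw a SecurityException.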

/**
 * Check consistency of the storage directory
 *
 * @param startOpt a startup option.
 *
 * @return state {@link StorageState} of the storage directory
 * @throws InconsistentFSStateException if directory state is not
 * consistent and cannot be recovered.
 * @throws IOException
 */
public StorageState analyzeStorage(StartupOption startOpt, Storage storage)
        throws IOException {
    assert root != null : "root is null";
    String rootPath = root.getCanonicalPath();
    try { // check that storage exists
        if (!root.exists()) {
            // storage directory does not exist
            if (startOpt != StartupOption.FORMAT) {
                LOG.warn("Storage directory " + rootPath + " does not exist");
                return StorageState.NON_EXISTENT;
            }
            LOG.info(rootPath + " does not exist. Creating ...");
            if (!root.mkdirs())
                throw new IOException("Cannot create directory " + rootPath);
        }
        // or is inaccessible
        if (!root.isDirectory()) {
            LOG.warn(rootPath + "is not a directory");
            return StorageState.NON_EXISTENT;
        }
        if (!root.canWrite()) {
            LOG.warn("Cannot access storage directory " + rootPath);
            return StorageState.NON_EXISTENT;
        }
    } catch (SecurityException ex) {
        LOG.warn("Cannot access storage directory " + rootPath, ex);
        return StorageState.NON_EXISTENT;
    }

    this.lock(); // lock storage if it exists

    if (startOpt == HdfsServerConstants.StartupOption.FORMAT)
        return StorageState.NOT_FORMATTED;

    if (startOpt != HdfsServerConstants.StartupOption.IMPORT) {
        storage.checkOldLayoutStorage(this);
    }

    // check whether current directory is valid
    File versionFile = getVersionFile();
    boolean hasCurrent = versionFile.exists();

    // check which directories exist
    boolean hasPrevious = getPreviousDir().exists();
    boolean hasPreviousTmp = getPreviousTmp().exists();
    boolean hasRemovedTmp = getRemovedTmp().exists();
    boolean hasFinalizedTmp = getFinalizedTmp().exists();
    boolean hasCheckpointTmp = getLastCheckpointTmp().exists();

    if (!(hasPreviousTmp || hasRemovedTmp
            || hasFinalizedTmp || hasCheckpointTmp)) {
        // no temp dirs - no recovery
        if (hasCurrent)
            return StorageState.NORMAL;
        if (hasPrevious)
            throw new InconsistentFSStateException(root,
                    "version file in current directory is missing.");
        return StorageState.NOT_FORMATTED;
    }

    if ((hasPreviousTmp ? 1 : 0) + (hasRemovedTmp ? 1 : 0)
            + (hasFinalizedTmp ? 1 : 0) + (hasCheckpointTmp ? 1 : 0) > 1)
        // more than one temp dirs
        throw new InconsistentFSStateException(root,
                "too many temporary directories.");

    // # of temp dirs == 1 should either recover or complete a transition
    if (hasCheckpointTmp) {
        return hasCurrent ? StorageState.COMPLETE_CHECKPOINT
                          : StorageState.RECOVER_CHECKPOINT;
    }

    if (hasFinalizedTmp) {
        if (hasPrevious)
            throw new InconsistentFSStateException(root,
                    STORAGE_DIR_PREVIOUS + " and " + STORAGE_TMP_FINALIZED
                    + "cannot exist together.");
        return StorageState.COMPLETE_FINALIZE;
    }

    if (hasPreviousTmp) {
        if (hasPrevious)
            throw new InconsistentFSStateException(root,
                    STORAGE_DIR_PREVIOUS + " and " + STORAGE_TMP_PREVIOUS
                    + " cannot exist together.");
        if (hasCurrent)
            return StorageState.COMPLETE_UPGRADE;
        return StorageState.RECOVER_UPGRADE;
    }

    assert hasRemovedTmp : "hasRemovedTmp must be true";
    if (!(hasCurrent ^ hasPrevious))
        throw new InconsistentFSStateException(root,
                "one and only one directory " + STORAGE_DIR_CURRENT
                + " or " + STORAGE_DIR_PREVIOUS
                + " must be present when " + STORAGE_TMP_REMOVED
                + " exists.");
    if (hasCurrent)
        return StorageState.COMPLETE_ROLLBACK;
    return StorageState.RECOVER_ROLLBACK;
}
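The three pre-checks at the top of that function (exists, is a directory, is writable) can be approximated from the shell before starting the namenode. The sketch below is an assumption-laden stand-in, not part of Hadoop: `check_storage_dir` is a hypothetical name, and the shell tests `-e`, `-d`, and `-w` only roughly mirror Java's `File.exists()`, `File.isDirectory()`, and `File.canWrite()`. It only gives a meaningful answer when run as the user the namenode runs as (hdfs in the question), since root passes the `-w` test on almost any directory.

```shell
#!/bin/sh
# check_storage_dir DIR
# Hypothetical helper: report whether DIR would fail the same three
# pre-checks that make analyzeStorage return NON_EXISTENT.
check_storage_dir() {
    dir=$1
    if [ ! -e "$dir" ]; then
        echo "NON_EXISTENT: $dir does not exist"
    elif [ ! -d "$dir" ]; then
        echo "NON_EXISTENT: $dir is not a directory"
    elif [ ! -w "$dir" ]; then
        echo "NON_EXISTENT: $dir is not writable by $(id -un)"
    else
        echo "OK: $dir passes the pre-checks"
    fi
}

# Example with the path from the question, run as the hdfs user:
#   check_storage_dir /data/nfs/dfs/nn
```

A path that the current user cannot reach at all (for example because a parent directory denies traversal) shows up here as "does not exist", which matches the wording of the first warning in the Java code.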
+0

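See my answer to this question. My guess is that because the new path was empty, the namenode treated it as nonexistent. I assumed there would be some kind of automatic copy into the new directory when it was added, but I guess I was wrong. –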
