1

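Mahout k-means: java.lang.IllegalStateException: No clusters found. Check your -c path

I am running Mahout's k-means from the command line to cluster data, using the following command: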

mahout kmeans -i /vect_out/tfidf-vectors/ -c /out_canopy -o /out_kmeans \
  -dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure -cd 1.0 -x 20 -cl

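Here /out_canopy is the directory containing the clusters created with Mahout canopy clustering. It contains a clusters-0 directory, which itself contains a directory named _logs and a file named part-r-00000.

However, the command reports the following error: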

java.lang.IllegalStateException: No clusters found. Check your -c path. 
at org.apache.mahout.clustering.kmeans.KMeansMapper.setup 

Answers

0

Are you sure /out_canopy is a directory? Have you tried:

file /out_canopy 

It looks like there may be a typo: perhaps you meant to write just out_canopy, or something similar...

+0

out_canopy does appear to be a directory. This is what 'hadoop fs -ls /' gives: 'drwxr-xr-x - rupinder supergroup 0 2013-03-07 17:11 /out_canopy' – user1976547 2013-03-11 12:37:26

0

This is a particularly tricky problem.

1. Swallow IllegalStateExceptions thrown by removeShutdownHook in FileSystem. The javadoc states: 

    public boolean removeShutdownHook(Thread hook) 
    Throws: 
    IllegalStateException - If the virtual machine is already in the process of shutting down 

So if we are getting this exception, it MEANS we are already in the process of shutdown, so we CANNOT, try what we may, removeShutdownHook. If Runtime had a method Runtime.isShutdownInProgress(), we could have checked for it before the removeShutdownHook call. As it stands, there is no such method. In my opinion, this would be a good patch regardless of the needs for this JIRA. 

2. Not send SIGTERMs from the NM to the MR-AM in the first place. Rather, we should expose a mechanism for the NM to politely tell the AM it's no longer needed and should shut down as soon as possible. Even after this, if an admin were to kill the MRAppMaster with a SIGTERM, the JobHistory would be lost, defeating the purpose of 3614.
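
Point 1 above (swallowing the IllegalStateException) might look roughly like the sketch below. The class name `ShutdownHooks` and the method `removeQuietly` are hypothetical names chosen for illustration; this is not the actual Hadoop patch. Only `Runtime.removeShutdownHook` is the real JDK call.

```java
// Hypothetical helper, for illustration only; not code from Hadoop's FileSystem.
final class ShutdownHooks {
    private ShutdownHooks() {}

    /**
     * Tries to de-register a shutdown hook.
     * Returns true if the hook was registered and has been removed,
     * false if it was not registered or the JVM is already shutting down.
     */
    static boolean removeQuietly(Thread hook) {
        try {
            return Runtime.getRuntime().removeShutdownHook(hook);
        } catch (IllegalStateException e) {
            // The JVM is already shutting down, so the hook cannot be removed.
            // Runtime offers no way to check for this state in advance.
            return false;
        }
    }
}
```

A caller would use `ShutdownHooks.removeQuietly(hook)` in place of a direct `Runtime.getRuntime().removeShutdownHook(hook)` call. The boolean result lets the caller tell "removed" from "not removed" without handling the exception itself.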