Error in as(x, class(k)): no method or default for coercing "NULL" to "data.frame"

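I am currently facing the error shown below, which concerns a NULL value being coerced to a data frame for which there is no method or default. The data set does contain null values, and I have tried both the is.na() and is.null() functions to replace the nulls with something else. The data is stored in HDFS in pig.hive format. I have attached the code below. If I remove v[,25] from the key, the code works fine. The error is: Error in as(x, class(k)): no method or default for coercing "NULL" to "data.frame"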

Code:

AM <- c("AN")
UK <- c("PP")

# Map: key on account, year and month; emit column 3 with a count of 1.
sample.map <- function(k, v) {
  key <- data.frame(acc   = v[!which(is.na(v[, 1])), 1],
                    year  = substr(v[!which(is.na(v[, 1])), 2], 1, 4),
                    month = substr(v[!which(is.na(v[, 1])), 2], 5, 6))
  value <- data.frame(v[, 3], count = 1)
  keyval(key, value)
}

# Reduce: sum the counts per category for each key.
sample.reduce <- function(key, v) {
  AT <- sum(v[which(v[, 1] %in% AM == "TRUE"), 2])
  UnknownT <- sum(v[which(v[, 1] %in% UK == "TRUE"), 2])
  Total <- AT + UnknownT
  d <- data.frame(AT, UnknownT, Total)
  keyval(key, d)
}

out <- mapreduce(input = "/user/hduser/input",
                 output = "/user/hduser/output",
                 input.format = make.input.format("pig.hive", sep = "\u0001"),
                 output.format = make.output.format("csv", sep = ","),
                 map = sample.map,
                 reduce = sample.reduce)

Error:

Warning in asMethod(object) : NAs introduced by coercion
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) :
  data length is not a multiple of split variable
Warning in rmr.split(x, x, FALSE, keep.rownames = FALSE) :
  number of items to replace is not a multiple of replacement length
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) :
  data length is not a multiple of split variable
Warning in rmr.split(v, ind, lossy = lossy, keep.rownames = TRUE) :
  number of items to replace is not a multiple of replacement length
Error in as(x, class(k)) :
  no method or default for coercing "NULL" to "data.frame"
Calls: <Anonymous> ... apply.reduce -> c.keyval -> reduce.keyval -> lapply -> FUN -> as
No traceback available

UPDATE: I have added sample data and edited the code above. Hope this helps!

Sample data:

NULL,"2014-03-14","PP" 
345689202,"2014-03-14","AN" 
234539390,"2014-03-14","PP" 
123125444,"2014-03-14","AN" 
NULL,"2014-03-14","AN" 
901828393,"2014-03-14","AN"
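
As an aside on the filtering expression used in the map function: the sketch below parses the sample rows with base R only (no rmr2 or Hadoop, so it is an approximation of what the pig.hive reader produces, and the variable names are made up for illustration). It suggests that !which(is.na(x)) does not keep the non-missing rows. which() returns positions, and negating non-zero positions yields FALSE for each of them, so nothing is selected. If the same thing happens in the job, the key would have fewer rows than the value, which might be related to the length-mismatch warnings above; the thread does not confirm this.

```r
# Sample rows from the question; "NULL" is treated as a missing value here
# (an assumption about how the first column ends up as NA).
txt <- 'NULL,"2014-03-14","PP"
345689202,"2014-03-14","AN"
234539390,"2014-03-14","PP"
123125444,"2014-03-14","AN"
NULL,"2014-03-14","AN"
901828393,"2014-03-14","AN"'

v <- read.csv(text = txt, header = FALSE, na.strings = "NULL",
              stringsAsFactors = FALSE)
acc <- v[, 1]

# Expression from the question: which() gives the positions of the NAs (1 and 5),
# and ! turns those non-zero numbers into FALSE, FALSE.
bad <- !which(is.na(acc))

# Plain logical mask: TRUE for every row whose first column is present.
good <- !is.na(acc)

length(acc[bad])   # 0: the recycled all-FALSE index selects nothing
length(acc[good])  # 4: the rows with a non-missing account number
```

Indexing with !is.na(acc) would keep the four complete rows; applying the same mask to the value would keep key and value the same length.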

Asked 2015-12-09 by Satej Wagle

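This is not reproducible. Please make it so.

Hi Roman, does this help? I would also like to mention that the data is stored on HDFS and that this snapshot is anonymized, but the real data looks like this.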

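A few issues with as() have been identified recently. I don't know why as() can't handle this case by default, but you can extend coerce, which handles the conversion through S4 methods, so that it calls as.data.frame.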

setMethod("coerce",c("NULL","data.frame"), function(from, to, strict=TRUE) as.data.frame(from)) 
[1] "coerce" 
as(NULL,"data.frame") 
data frame with 0 columns and 0 rows

Answered 2015-12-09 20:24:42 by James

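Where should I run this code? At the moment my Hadoop environment consists of 3 worker nodes with R and the rmr2 package installed. Should I run it on all of these nodes? And should I run it every time I run the script? Sorry for asking so many questions.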

Yes, it needs to be run by every worker that needs to use the method. It would be best to put it in a .profile file so that it runs at startup. – James
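
A minimal sketch of what such a startup file could contain, assuming the file meant here is R's user profile (~/.Rprofile) on each worker node; the thread only says ".profile", so the exact file name and location are an assumption. The methods package is loaded explicitly because only the base package is available while a profile file is being sourced.

```r
# Assumed location: ~/.Rprofile on every worker node (not confirmed by the thread).
# Profile files are sourced before the default packages are attached,
# so load methods explicitly before calling setMethod().
library(methods)

# Register a coerce method so that as(NULL, "data.frame") returns an
# empty data frame instead of raising an error.
setMethod("coerce", c("NULL", "data.frame"),
          function(from, to, strict = TRUE) as.data.frame(from))

# Quick sanity check; this line could be dropped from the profile itself.
res <- as(NULL, "data.frame")
```

Running the check in a fresh session on each node is one way to confirm that the profile was actually picked up.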

This worked! I added it to the .profile file and restarted my R session. Thanks for the prompt response, James :)
