2015-05-04 25 views

Error in data.table: Item has no length? - R

I have an R script containing a function that I received in an answer to this question: R: For loop nested in for loop

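The script works fine on the first part of my dataset. I am now trying to use it on another part which, as far as I can tell, has exactly the same format as the first, but for some reason I get an error when I run the script. I cannot figure out what is causing the error.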

This is the script I am using:

require(data.table) 

MappingTable_Calibrated = read.csv2(file.choose(), header=TRUE) 
head(MappingTable_Calibrated) 

# The data is sorted primarily by Scaffold number in ascending order, and secondarily by Cal_Startgen in ascending order. 

MappingTable_Calibratedord = MappingTable_Calibrated[order(MappingTable_Calibrated$Scaffold, MappingTable_Calibrated$Cal_Startgen),] 
head(MappingTable_Calibratedord) 

dt <- data.table(MappingTable_Calibratedord, key = "Scaffold,Cal_Startgen") 
head(dt) 

# The following function creates pairs of loci for each scaffold. 
# The function is a modified version of a function retrieved from http://www.stackoverflow.com 

fn = function(dtIn, id) {

# Creates the object dtHead containing all lines of dtIn except the last line. 

dtHead = head(dtIn, n = nrow(dtIn) - 1)  

# The names of dtHead are appended with _a. paste0 short for: paste(x, sep="") 

setnames(dtHead, paste0(colnames(dtHead), "_a")) 

# Creates the object dtTail containing all lines of dtIn except the first line. 

dtTail = tail(dtIn, n = nrow(dtIn) - 1)  

# The names of dtTail are appended with _b. 

setnames(dtTail, paste0(colnames(dtTail), "_b")) 

# dtHead and dtTail are combined. Scaffold is set to id. The placeholder column "Pairwise_Distance" (all zeros) is added to the table. 

cbind(dtHead, dtTail, Scaffold = id, Pairwise_Distance = 0) 

} 

# The function is run on the data. .SDcols defines the columns to be included in the output. 

output = dt[, fn(.SD, Scaffold), by = Scaffold, .SDcols = c("Name", "Startpos", "Endpos", "Rev", "Startgen", "Endgen", "Cal_Startgen", "Cal_Endgen", "Length")] 
output = as.data.frame(output[, with = FALSE]) 

However, when trying to create "output" I get the following error: Error in data.table(..., key = key(..1)) : Item 1 has no length. Provide at least one item (such as NA, NA_integer_ etc) to be repeated to match the 2 rows in the longest column. Or, all columns can be 0 length, for insert()ing rows into.

dt looks like this:

         Name Length Startpos Endpos Scaffold Startgen Endgen Rev Match Cal_Startgen Cal_Endgen
1: Locus_7173    144        0    144       34   101196 101340   1     1       101196     101340
2:  Locus_133    110        0    110       34   223659 223776   1     1       223659     223776
3: Locus_2746    161        0     89       65   101415 101504   1     1       101415     101576

The full dput of "dt" can be found here: https://www.dropbox.com/sh/3j4i04s2rg6b63h/AADkWG3OcsutTiSsyTl8L2Vda?dl=0

Answer


Start by tracking down the data that causes the error:

fn = function(dtIn, id){ 
    dtHead = head(dtIn, n = nrow(dtIn) - 1)  
    setnames(dtHead, paste0(colnames(dtHead), "_a")) 
    dtTail = tail(dtIn, n = nrow(dtIn) - 1)  
    setnames(dtTail, paste0(colnames(dtTail), "_b")) 
    r <- tryCatch(cbind(dtHead, dtTail, Scaffold = id, Pairwise_Distance = 0), error = function(e) NULL) 
    if(is.null(r)) browser() 
    r 
} 

Then you can see that you are trying to cbind elements of different nrow/length:

Browse[1]> dtHead 
Empty data.table (0 rows) of 9 cols: Name_a,Startpos_a,Endpos_a,Rev_a,Startgen_a,Endgen_a... 
Browse[1]> dtTail 
Empty data.table (0 rows) of 9 cols: Name_b,Startpos_b,Endpos_b,Rev_b,Startgen_b,Endgen_b... 
Browse[1]> id 
[1] 76 
Browse[1]> 0 
[1] 0 

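This is not allowed: dtHead and dtTail have zero rows, while id and 0 each have length 1.
I suggest adding an if(nrow(...)) check or something similar, and then adding the columns id = integer(), Pairwise_Distance = numeric() for the nrow = 0 case.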
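
One possible way to write such a guard is sketched below. It is a minimal sketch rather than the answerer's exact code: the toy table is hypothetical, only two of the original columns are used, and instead of building a zero-row table with typed empty columns it returns NULL for scaffolds with fewer than two loci, which makes data.table skip that group.

```r
library(data.table)

# Hypothetical toy data: scaffold 34 has two loci, scaffold 65 has only one.
dt <- data.table(
  Name         = c("Locus_7173", "Locus_133", "Locus_2746"),
  Scaffold     = c(34L, 34L, 65L),
  Cal_Startgen = c(101196L, 223659L, 101415L),
  key = c("Scaffold", "Cal_Startgen")
)

fn <- function(dtIn, id) {
  # A scaffold with fewer than two loci has no pairs. Returning NULL
  # means this group contributes no rows to the result.
  if (nrow(dtIn) < 2L) return(NULL)

  # All rows except the last, with "_a" appended to the column names.
  dtHead <- head(dtIn, n = nrow(dtIn) - 1L)
  setnames(dtHead, paste0(colnames(dtHead), "_a"))

  # All rows except the first, with "_b" appended to the column names.
  dtTail <- tail(dtIn, n = nrow(dtIn) - 1L)
  setnames(dtTail, paste0(colnames(dtTail), "_b"))

  # Both halves now have the same (non-zero) number of rows, so the
  # length-1 values id and 0 can be recycled safely.
  cbind(dtHead, dtTail, Scaffold = id, Pairwise_Distance = 0)
}

output <- dt[, fn(.SD, Scaffold), by = Scaffold,
             .SDcols = c("Name", "Cal_Startgen")]
print(output)
```

With this toy input only scaffold 34 yields a pair; scaffold 65 is dropped from the output instead of raising the "Item 1 has no length" error.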

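I'm not entirely sure what the above shows me. What do id 76 etc. actually tell me? I guess the ultimate question is: what in my actual data is causing the problem? – Hjalte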

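The value 76 is not important. What matters is that it is a non-zero-length value and you are trying to cbind it to a zero-row data.table. You cannot have a data.table in which some columns have length 0 (all the columns of the data.tables) while others have length 1 (the id variable). – jangorecki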
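
To find out which scaffolds in the actual data lead to this situation, one option is to count the rows per scaffold and look for groups with a single locus. The sketch below uses a hypothetical toy table in place of the real mapping table; only the column name Scaffold is taken from the question.

```r
library(data.table)

# Hypothetical toy data standing in for the real mapping table.
dt <- data.table(
  Name     = c("Locus_7173", "Locus_133", "Locus_2746"),
  Scaffold = c(34L, 34L, 65L)
)

# Count loci per scaffold and keep the scaffolds that cannot form a pair.
singletons <- dt[, .N, by = Scaffold][N < 2L]
print(singletons)

# For such a scaffold, dropping the last row leaves a zero-row table,
# which is what dtHead and dtTail looked like in the browser session.
one_row <- dt[Scaffold == 65L]
empty_head <- head(one_row, n = nrow(one_row) - 1L)
print(nrow(empty_head))
```

Any scaffold listed in singletons would produce empty dtHead and dtTail tables inside the function, and therefore the cbind error.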

Hello Hjalte, I got the same error too. Is it because of the data table or something else? –
