2015-10-05 50 views
3

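R - not all elements returned (for loop)

The following simple loop seems to skip elements of a data frame. I would like to find out whether the problem lies in the data or in the code.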

foo <- apply(data, 1, function(x) { 

    vec <- x 
    mylist <- list() 

    for (i in vec){ 
     #print(i) 
     mylist[[i]]<-i 
    } 
    print(length(vec)) 
    print(length(mylist)) 
}) 

My data frame has 25 columns. For some rows, length(vec) returns 25 while length(mylist) returns 24:

[1] 25 
[1] 24 

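If I uncomment print(i), I can see all 25 elements for every row.

The above is a simplification of the actual code I want to use, but the problem already shows up in this simple form.

Thanks in advance!

PS. I have tried treating the data as character and as factor. That does not seem to affect the problem either.

PPS. These two rows of the data frame give different results (although they contain the same number of elements):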

structure(list(data1.LOC = c("LL_A1_00000003068_686", "LL_A1_00000003538_274"), REF = c("G", "T"), ALT = c("C", "C"), L47.variant = c("0/1:28,34:62:99:1154,0,926", "0/0:9,0:9:21:0,21,276"), L51.variant = c("0/0:61,0:61:99:0,184,2417", "0/0:6,0:6:15:0,15,192"), LCro11.variant = c("0/0:24,0:24:72:0,72,951", "0/0:2,0:2:6:0,6,80"), LCro5.variant = c("0/0:48,0:48:99:0,141,1869", "0/0:5,0:5:15:0,15,173"), N01.variant = c("0/1:22,16:38:99:526,0,758", "1/1:0,2:2:6:63,6,0"), N09.variant = c("1/1:1,50:51:99:1885,110,0", "0/0:12,0:12:36:0,36,460"), Nor28.variant = c("1/1:0,23:23:66:874,66,0", "0/0:5,0:5:12:0,12,159"), P161.variant = c("1/1:0,54:55:99:2118,163,0", "0/0:2,0:2:6:0,6,80"), Rom155.variant = c("0/0:69,0:69:99:0,208,2749", "0/1:5,3:8:99:102,0,102"), Rom161.variant = c("0/0:75,0:75:99:0,226,2957", "0/0:5,0:5:15:0,15,196"), Rom303.variant = c("0/0:44,0:44:99:0,132,1739", "0/0:5,0:5:15:0,15,195"), Rus291.variant = c("0/1:43,30:73:99:972,0,1443", "0/1:1,3:4:28:108,0,28"), Rus292.variant = c("0/0:56,0:56:99:0,163,2139", "0/0:11,0:11:33:0,33,429"), Sl5t.variant = c("0/1:55,34:89:99:1003,0,1911", "0/0:10,0:10:30:0,30,379"), Sl6t.variant = c("0/0:89,0:89:99:0,268,3513", "0/0:10,0:10:30:0,30,383"), s037y.variant = c("0/0:63,0:63:99:0,190,2484", "0/0:8,0:8:18:0,18,236"), s087y.variant = c("0/0:72,0:72:99:0,211,2770", "0/0:6,0:6:15:0,15,179"), s2E03.variant = c("0/1:34,27:61:99:810,0,1175", "0/0:4,0:4:12:0,12,143"), s2L05.variant = c("0/0:56,0:56:99:0,169,2220", "0/1:4,4:8:95:139,0,95"), s2P01.variant = c("0/1:44,27:71:99:859,0,1519", "0/0:6,0:6:18:0,18,240"), s2R01.variant = c("1/1:0,68:68:99:2642,202,0", "0/1:5,6:11:99:202,0,130"), s2R05.variant = c("0/1:41,33:74:99:1012,0,1393", "0/0:8,0:8:24:0,24,312")), .Names = c("data1.LOC", "REF", "ALT", "L47.variant", "L51.variant", "LCro11.variant", "LCro5.variant", "N01.variant", "N09.variant", "Nor28.variant", "P161.variant", "Rom155.variant", "Rom161.variant", "Rom303.variant", "Rus291.variant", "Rus292.variant", "Sl5t.variant", 
"Sl6t.variant", "s037y.variant", "s087y.variant", "s2E03.variant", "s2L05.variant", "s2P01.variant", "s2R01.variant", "s2R05.variant"), row.names = 19:20, class = "data.frame") 
+1

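In general, you want to assign into a list like `mylist[[i]] <- i`; not sure whether that will fix it. Also, while simplifying the problem is good, in this case it is also important to provide a reproducible example (with data): http://stackoverflow.com/a/28481250/1191259 – Frank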

+0

Thanks for the quick response! I have added two rows of the data frame. Sadly, your suggestion did not help (but thanks!) – dwf

+4

Please share your data with `dput()`; getting the proper underlying structure is important here. – Gregor
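As a rough illustration of the `dput()` suggestion, here is a minimal sketch using a small hypothetical data frame `df` (not the asker's data). It shows that the printed output is R code that can be pasted back to rebuild the object with its column types intact.

```r
# Hypothetical two-row data frame standing in for the real data.
df <- data.frame(REF = c("G", "T"), ALT = c("C", "C"),
                 stringsAsFactors = FALSE)

# dput() prints R code that recreates the object, including column types.
dput(df)

# A reader would paste that output into their session; here the output is
# captured as text and evaluated to mimic that round trip.
txt <- capture.output(dput(df))
rebuilt <- eval(parse(text = txt))
identical(df, rebuilt)
```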

Answers

2

Instead of using the elements of vec to index mylist, which updates the same element whenever a value is duplicated, index by position:

foo <- apply(data, 1, function(x) { 

    vec <- x 
    mylist <- list() 

    for (i in seq_along(vec)){ 
     #print(i) 
     # index by position so duplicated values get separate elements 
     mylist[[i]] <- vec[i] 
    } 
    print(length(vec)) 
    print(length(mylist)) 
}) 

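In summary, your code did not work because you probably have duplicates: `mylist[[i]] <- i` uses each value as the index, so a repeated value overwrites the same element. You should instead iterate over vec with a standard integer index i running along its length, as the loop above does. For example, if vec <- c(1, 1, 2), then length(vec) == 3, but the original loop gives length(mylist) == 2. (From a comment by nicola.)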
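The effect of duplicates can be reproduced in a few lines. This is a minimal sketch with a hypothetical four-element character vector (one value repeated, loosely modelled on the posted rows); the names `by_value` and `by_position` are illustrative only.

```r
# Hypothetical row with one repeated value.
vec <- c("G", "C", "0/0:2,0:2:6:0,6,80", "0/0:2,0:2:6:0,6,80")

# Indexing by value: a character index is treated as an element name,
# so the repeated value overwrites the element created the first time.
by_value <- list()
for (v in vec) {
  by_value[[v]] <- v
}

# Indexing by position: every element gets its own slot.
by_position <- vector("list", length(vec))
for (i in seq_along(vec)) {
  by_position[[i]] <- vec[[i]]
}

length(by_value)     # 3
length(by_position)  # 4
anyDuplicated(vec)   # 4, the position of the first repeated value (0 means none)
```

Running `anyDuplicated()` on each row is one quick way to check which rows would lose elements under value-based indexing.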

+0

Thank you so much! This has been driving me crazy. I applied your suggestion to the large input file and it works like a charm – dwf

+0

A few words explaining the change, and why the OP's approach did not work as expected... – Frank

+2

@nicola's comment above is the explanation. –