2015-10-21

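How to transform and include a data frame in another data frame

I have another tricky task that I can't get a handle on at the moment. It deals with data frames in R.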

Say I have a data frame that looks like this:

original = data.frame(Male = c(rep(1,3),rep(2,4),rep(3,2)), 
        SongNumber = c(1,2,3,1,2,3,4,1,2), 
        SongType = c("16a","16b","17a","24a","24b","25d","24f","5e","5e"), 
        Start = c(0.5,16.1,24.2,0.9,10.1,18.9,0.7,0.6,12.2), 
        RecordFile = c(rep("A1",3),rep("B1",3),"B2",rep("C1",2))) 
original 

and another data frame that contains the syllable order for each song type:

additional = data.frame(SongType = c("16a","16b","17a","24a"), 
        Syll1 = c(4,4,3,16), 
        Syll2 = c(4,4,3,16), 
        Syll3 = c(84,84,3,3), 
        Syll4 = c(3,3,3,16), 
        Syll5 = c(16,16,3,3), 
        Syll6 = c(16,16,NA,4), 
        Syll7 = c(NA,16,NA,NA), 
        Syll8 = c(NA,16,NA,NA), 
        Syll9 = c(NA,3,NA,NA), 
        Syll10 = c(NA,1,NA,NA)) 
additional 

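I would now like to insert the syllable order as a column in the first data frame, with one row per syllable. The final result should look like this: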

aim = data.frame(Male = c(rep(1,21),rep(2,9),rep(3,2)), 
      SongNumber = c(rep(1,6),rep(2,10),rep(3,5),rep(1,6),2,3,4,1,2), 
      SongType = c(rep("16a",6),rep("16b",10),rep("17a",5),rep("24a",6),"24b","25d", 
          "24f","5e","5e"), 
      Start = c(rep(0.5,6),rep(16.1,10),rep(24.2,5),rep(0.9,6),10.1,18.9,0.7,0.6, 
         12.2), 
      RecordFile = c(rep("A1",21),rep("B1",8),"B2",rep("C1",2)), 
      SyllOrder = c(4,4,84,3,16,16,4,4,84,3,16,16,16,16,3,1,3,3,3,3,3,16,16,3,16,3,4, 
          NA,NA,NA,NA,NA)) 
aim 

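So far I don't see how functions such as merge would help: merge only adds the columns of dataframe2 to dataframe1 based on a column the two data frames have in common. It doesn't force dataframe1 to add rows accordingly!

Well, the answer provided by @Thierry, which is to first convert 'additional' to long format, can't be found in [How to join (merge) data frames (inner, outer, left, right)?] – KrisAnathema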

Answers

3

To get the desired output, you could do:

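library(data.table) 

# Reshape 'additional' to long format, drop the NAs, and collapse each song
# type's syllables into one string (toString() joins with ", ").
additional2 <- melt(setDT(additional), id="SongType", na.rm=TRUE)[, .(SyllOrder = toString(value)), by = SongType] 

# Add the string to 'original' by reference (on= needs data.table v1.9.6 or
# later), then split on ", " to get one row per syllable.
aim2 <- setDT(original)[additional2, SyllOrder := i.SyllOrder, on="SongType" 
         ][, lapply(.SD, function(x) unlist(tstrsplit(x, ", ", fixed=TRUE))), 
          by=setdiff(names(original),"SyllOrder")] 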

As an alternative for the last step, you could also use:

# Join first, then split on ", " (the separator that toString() inserts)
aim2 <- additional2[original, on="SongType" 
        ][, lapply(.SD, function(x) unlist(tstrsplit(x, ", ", fixed=TRUE))), 
         by=setdiff(names(original),"SyllOrder")] 

Both give:

> aim2 
    Male SongNumber SongType Start RecordFile SyllOrder 
1: 1   1  16a 0.5   A1   4 
2: 1   1  16a 0.5   A1   4 
3: 1   1  16a 0.5   A1  84 
4: 1   1  16a 0.5   A1   3 
5: 1   1  16a 0.5   A1  16 
6: 1   1  16a 0.5   A1  16 
7: 1   2  16b 16.1   A1   4 
8: 1   2  16b 16.1   A1   4 
9: 1   2  16b 16.1   A1  84 
10: 1   2  16b 16.1   A1   3 
11: 1   2  16b 16.1   A1  16 
12: 1   2  16b 16.1   A1  16 
13: 1   2  16b 16.1   A1  16 
14: 1   2  16b 16.1   A1  16 
15: 1   2  16b 16.1   A1   3 
16: 1   2  16b 16.1   A1   1 
17: 1   3  17a 24.2   A1   3 
18: 1   3  17a 24.2   A1   3 
19: 1   3  17a 24.2   A1   3 
20: 1   3  17a 24.2   A1   3 
21: 1   3  17a 24.2   A1   3 
22: 2   1  24a 0.9   B1  16 
23: 2   1  24a 0.9   B1  16 
24: 2   1  24a 0.9   B1   3 
25: 2   1  24a 0.9   B1  16 
26: 2   1  24a 0.9   B1   3 
27: 2   1  24a 0.9   B1   4 
28: 2   2  24b 10.1   B1  NA 
29: 2   3  25d 18.9   B1  NA 
30: 2   4  24f 0.7   B2  NA 
31: 3   1  5e 0.6   C1  NA 
32: 3   2  5e 12.2   C1  NA 
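
Thanks! It looks very clever, but when I run it in R I get an error message. Here is the message: "`[.data.table`(setDT(original), additional2, `:=`(SyllOrder, : unused argument (on = "SongType")" – KrisAnathema

Which version of `data.table` do you have? You need the latest one from CRAN (i.e. *v1.9.6*) – Jaap

I had *v1.9.4*! I updated it and now it runs perfectly, thanks a lot! – KrisAnathema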
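
A quick way to check for this before running the join is to compare the installed version against the one named in the comment above. This is only a minimal sketch: `min_version` and `has_on` are names made up for the illustration, and the 1.9.6 threshold is taken from the comment, not verified independently.

```r
# Minimum data.table version, as stated in the comment thread above
min_version <- "1.9.6"

# requireNamespace() returns FALSE instead of failing when the package is missing
installed <- requireNamespace("data.table", quietly = TRUE)

# TRUE only if data.table is installed and at least min_version
has_on <- installed && packageVersion("data.table") >= min_version

if (!has_on) {
  message("data.table >= ", min_version, " is needed for the on= argument; ",
          "try install.packages(\"data.table\")")
}
```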

1

You need to convert additional to long format. Then you can merge them.

library(dplyr) 
library(tidyr) 
additional %>% 
    gather("Syllable", "SyllOrder", -SongType) %>% 
    inner_join(original, by = "SongType") 
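
Thanks @Thierry! Perfect, exactly what I needed! – KrisAnathema

@KrisAnathema This does not give the output you specified in the 'aim' dataset. – Jaap

Yes, it almost does. It just adds a Syllable column, plus SyllOrder entries that are NA. But I can remove them! – KrisAnathema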
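
For reference, the gap between the long-format join and `aim` can be closed by dropping the NA syllables before joining and by using a left join, so that songs without syllable data are kept. The following is a rough sketch in base R rather than dplyr/tidyr, so it is not the code from either answer. The helper names `syll_cols`, `long`, `Position`, `row_id` and `result` are introduced here purely for illustration.

```r
original <- data.frame(
  Male = c(rep(1, 3), rep(2, 4), rep(3, 2)),
  SongNumber = c(1, 2, 3, 1, 2, 3, 4, 1, 2),
  SongType = c("16a", "16b", "17a", "24a", "24b", "25d", "24f", "5e", "5e"),
  Start = c(0.5, 16.1, 24.2, 0.9, 10.1, 18.9, 0.7, 0.6, 12.2),
  RecordFile = c(rep("A1", 3), rep("B1", 3), "B2", rep("C1", 2)))

additional <- data.frame(
  SongType = c("16a", "16b", "17a", "24a"),
  Syll1 = c(4, 4, 3, 16), Syll2 = c(4, 4, 3, 16), Syll3 = c(84, 84, 3, 3),
  Syll4 = c(3, 3, 3, 16), Syll5 = c(16, 16, 3, 3), Syll6 = c(16, 16, NA, 4),
  Syll7 = c(NA, 16, NA, NA), Syll8 = c(NA, 16, NA, NA),
  Syll9 = c(NA, 3, NA, NA), Syll10 = c(NA, 1, NA, NA))

# Wide -> long: one row per song type and syllable position
syll_cols <- setdiff(names(additional), "SongType")
long <- data.frame(
  SongType  = rep(additional$SongType, times = length(syll_cols)),
  Position  = rep(seq_along(syll_cols), each = nrow(additional)),
  SyllOrder = unlist(additional[syll_cols], use.names = FALSE))

# Drop the padding NAs so they do not become extra rows
long <- long[!is.na(long$SyllOrder), ]

# Left join: all.x = TRUE keeps songs that have no syllable data
original$row_id <- seq_len(nrow(original))
result <- merge(original, long, by = "SongType", all.x = TRUE)

# merge() sorts by the key, so restore song order and syllable order
result <- result[order(result$row_id, result$Position), ]
result <- result[c("Male", "SongNumber", "SongType", "Start",
                   "RecordFile", "SyllOrder")]
rownames(result) <- NULL
```

Because `merge()` repeats a row of `original` once for every matching row in `long`, the join itself adds the extra rows; the `Position` and `row_id` columns are only there to put them back in order.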