2012-05-10

How do I add the missing values of a sequence?

I have a data frame like this:

structure(list(x = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 
24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 
37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 
50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 
63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 
76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 88L, 
89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 99L, 100L, 
101L, 102L, 103L, 104L, 105L, 106L, 107L, 108L, 109L, 110L, 112L, 
113L, 114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 123L, 124L, 
125L, 127L, 128L, 129L, 130L, 132L, 133L, 134L, 135L, 136L, 137L, 
138L, 139L, 140L, 141L, 142L, 143L, 145L, 146L, 147L, 148L, 149L, 
150L, 151L, 152L, 153L, 154L, 155L, 158L, 160L, 163L, 164L, 166L, 
167L, 169L, 170L, 173L, 174L, 178L, 179L, 181L, 182L, 183L, 186L, 
187L, 191L, 192L, 193L, 194L, 197L, 198L, 200L, 205L, 208L, 209L, 
213L, 214L, 216L, 217L, 220L, 222L, 223L, 225L, 229L, 233L, 235L, 
237L, 242L, 243L, 244L, 251L, 253L, 254L, 255L, 261L, 262L, 263L, 
264L, 267L, 268L, 269L, 270L, 276L, 281L, 282L, 284L, 285L, 287L, 
289L, 293L, 295L, 297L, 299L, 301L, 306L, 308L, 315L, 317L, 318L, 
320L, 327L, 330L, 336L, 337L, 345L, 346L, 355L, 359L, 376L, 377L, 
379L, 384L, 387L, 388L, 402L, 405L, 408L, 415L, 420L, 421L, 427L, 
428L, 429L, 430L, 437L, 438L, 439L, 440L, 446L, 448L, 453L, 456L, 
469L, 472L, 476L, 478L, 481L, 483L, 486L, 487L, 488L, 497L, 500L, 
502L, 504L, 507L, 512L, 525L, 530L, 531L, 543L, 546L, 550L, 578L, 
581L, 598L, 601L, 680L, 689L, 693L, 712L, 728L, 746L, 768L, 790L, 
794L, 840L, 851L, 861L, 928L, 969L, 1010L, 1180L, 1698L), freq = c(29186L, 
12276L, 5851L, 3938L, 3133L, 1894L, 1157L, 820L, 597L, 481L, 
398L, 297L, 269L, 251L, 175L, 176L, 153L, 130L, 117L, 108L, 93L, 
83L, 58L, 84L, 60L, 43L, 59L, 51L, 57L, 53L, 38L, 38L, 32L, 35L, 
28L, 27L, 29L, 22L, 24L, 29L, 30L, 23L, 26L, 19L, 19L, 25L, 14L, 
22L, 16L, 12L, 15L, 14L, 11L, 13L, 18L, 10L, 17L, 20L, 7L, 9L, 
2L, 8L, 12L, 8L, 7L, 10L, 10L, 9L, 6L, 6L, 9L, 5L, 11L, 4L, 5L, 
5L, 10L, 4L, 6L, 1L, 4L, 7L, 3L, 4L, 3L, 2L, 3L, 5L, 7L, 2L, 
2L, 3L, 2L, 4L, 7L, 1L, 3L, 5L, 5L, 3L, 5L, 2L, 2L, 2L, 3L, 2L, 
5L, 7L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 3L, 2L, 2L, 1L, 
3L, 4L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 1L, 4L, 3L, 1L, 2L, 2L, 1L, 
1L, 1L, 1L, 2L, 3L, 1L, 1L, 3L, 2L, 1L, 1L, 1L, 4L, 4L, 1L, 2L, 
2L, 4L, 2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 1L, 1L, 1L, 1L, 
3L, 2L, 1L, 3L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 2L, 1L, 1L, 2L, 1L, 1L, 
2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 4L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("x", 
"freq"), row.names = c(NA, -296L), class = "data.frame") 

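The x values have gaps (the first missing value is 111, right after x = 110). Is there a way to make this data frame continuous in steps of 1, i.e. running from 1 to 1698, filling in the whole sequence and setting freq to 0 for every x that has no value? What I mean is: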

1,2 
4,5 
5,7 

should be converted to:

1,2 
2,0 
3,0 
4,5 
5,7 

Any suggestions?

Answers


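I would build the set of values not covered by the x column, create a data frame from those values, and assign a freq of 0 to all of them. Then bind it to the original data and order the result by x.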

#I called your data dat 
y <- 1:max(dat$x) 
dat2 <- data.frame(x=y[!y%in%dat$x], freq=0) 
dat3 <- rbind(dat, dat2) 
dat4 <- dat3[order(dat3$x), ]    #could stop here 
rownames(dat4) <- NULL      #but I hate non sequential row names 
dat4 
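
To see what this does, here is a minimal sketch of the same steps on the three-row example from the question. The names `toy`, `gaps` and `filled` are made up for illustration, and `setdiff()` is used in place of the `%in%` filter; it should return the same missing values.

```r
# Toy data from the question: x = 2 and x = 3 are missing
toy <- data.frame(x = c(1L, 4L, 5L), freq = c(2L, 5L, 7L))

y <- seq_len(max(toy$x))          # full sequence 1..max(x)
gaps <- setdiff(y, toy$x)         # x values that have no row yet

# Append one zero-frequency row per gap, then restore the x order
filled <- rbind(toy, data.frame(x = gaps, freq = rep(0L, length(gaps))))
filled <- filled[order(filled$x), ]
rownames(filled) <- NULL          # sequential row names again
filled
```

For x = 1 to 5 the freq column comes out as 2, 0, 0, 5, 7, which matches the output asked for in the question.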

You can also use merge (assuming your data is stored in l):

l <- merge(l,data.frame(x = 1:1698),all = TRUE,by = "x") 
l$freq[is.na(l$freq)] <- 0 
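
A small sketch of the same idea on the toy example from the question, assuming you would rather derive the upper bound from the data than hard-code 1698. The name `toy` is made up for illustration.

```r
# Toy data from the question: x = 2 and x = 3 are missing
toy <- data.frame(x = c(1L, 4L, 5L), freq = c(2L, 5L, 7L))

# Outer join against the full sequence; unmatched x values get freq = NA
toy <- merge(toy, data.frame(x = seq_len(max(toy$x))), by = "x", all = TRUE)

# Replace the NAs introduced by the join with zero counts
toy$freq[is.na(toy$freq)] <- 0L
toy
```

merge() sorts the result by the key column by default, so no separate order() step should be needed here.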

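If you want to write less code and speed isn't a concern, go with Joran's answer (+1); if you want something faster, mine is the better choice.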
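
If speed matters for your data, it may be worth timing both approaches yourself rather than taking either claim on faith. The following is only a sketch: the synthetic data and the wrapper names `fill_rbind` and `fill_merge` are assumptions made for illustration, and timings will vary by machine and R version.

```r
set.seed(1)
# Hypothetical sparse data: 5000 distinct x values drawn from 1..100000
dat <- data.frame(x = sort(sample.int(100000L, 5000L)), freq = 1L)

# rbind approach: append zero rows for the gaps, then sort by x
fill_rbind <- function(d) {
  y <- seq_len(max(d$x))
  gaps <- y[!y %in% d$x]
  out <- rbind(d, data.frame(x = gaps, freq = rep(0L, length(gaps))))
  out <- out[order(out$x), ]
  rownames(out) <- NULL
  out
}

# merge approach: outer join against the full sequence, then zero the NAs
fill_merge <- function(d) {
  out <- merge(d, data.frame(x = seq_len(max(d$x))), by = "x", all = TRUE)
  out$freq[is.na(out$freq)] <- 0L
  out
}

t_rbind <- system.time(res_rbind <- fill_rbind(dat))
t_merge <- system.time(res_merge <- fill_merge(dat))

# Elapsed seconds for each approach
print(c(rbind = t_rbind[["elapsed"]], merge = t_merge[["elapsed"]]))
```

Both functions should produce the same filled data frame, so the choice comes down to timing and readability.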