I have data from several sources of questionable reliability. Simplified data:
date | value | source
===================================
2011-09-30 | 10.9910 | best
2011-12-31 | 11.5000 | ok
2011-12-31 | 11.5290 | best
2012-03-31 | 12.8477 | ok
2012-03-31 | 12.4677 | worst
2012-06-30 | -1.5 | unacceptable
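I would like to clean this up into a simple time series, with one value per date chosen by source priority: "best" beats "ok" beats "worst", and "unacceptable" is thrown away. For my example: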
date | value
========================
2011-09-30 | 10.9910
2011-12-31 | 11.5290
2012-03-31 | 12.8477
2012-06-30 | NA # or just skip this line
Any ideas on how to do this nicely? The dput of my sample data is:
df = structure(list(date = structure(c(15247, 15339, 15339, 15430, 15430, 15491, 15613, 15613, 15705, 15795, 15795, 15886, 15978, 15978, 15978, 16070, 16070, 16070, 16160, 16160), class = "Date"),
value = c(10.991, 11.500, 11.529, 12.8477, 12.4677, 11.542, 12.1203, 12.1146, 12.5053, 13.3556, 13.3628, 13.3372, 13.844, 13.844, 13.8419, 15.3403, 15.3403, 15.3306, 15.202, 15.202 ),
source = c("best", "ok", "best", "ok", "worst", "ok", "ok", "worst", "ok", "ok", "worst", "unacceptable", "ok", "best", "worst", "ok", "best", "worst", "ok", "best")),
row.names = c(NA, 20L),
.Names = c("date", "value", "source"),
class = "data.frame")