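Finding overlaps in a data table
I have data with an ID, a date, and an integer value for each combination of ID and start date, with multiple dates per ID.
I want to create an indicator column that:
1) tells me whether an ID has, within a 12-month period, either a sum of Integer values >= 14 or a count of 4 separate Integer entries.
There is a similar question here, but my case is a bit more complicated: Create new column based on condition that exists within a rolling date
Any help is much appreciated!
Here is a dput of some of the data: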
structure(list(ID = c("90939293", "90963328", "90092983",
"90032926", "90944838", "90092983", "90062392", "90224939", "90202398",
"90926203", "90936043", "90329263", "90944838", "90232033", "90980903",
"90924463", "90299292", "90933383", "90209349", "90092983", "90022988",
"90022293", "90933383", "90092983", "90299240", "90963033", "90004923",
"90292998", "90986096", "90980903", "90336692", "90933383", "90022988",
"90069992", "90062392", "90209248", "90924463", "90092983", "90933383",
"90022293", "90062392", "90004923", "90233269", "90329263", "90229202",
"90309943", "90299292", "90036820", "90329263", "90232033", "90329263",
"90336692", "90963033", "90224939", "90924463", "90069992", "90092983",
"90934923", "90926203", "90222333", "90092983", "90299292", "90202398",
"90004923", "90233269", "90926203", "90222333", "90224939", "90232033",
"90933383", "90022293", "90022988", "90934923", "90069992", "90329263",
"90209349", "90022293", "90309943", "90299240", "90022293", "90336692",
"90020334", "90933383", "90290384", "90224939", "90980903", "90299240",
"90299292", "90202398", "90022346"), Date = structure(c(15972,
16009, 16010, 16010, 16007, 16010, 16006, 16010, 16007, 16008,
15997, 16007, 16007, 16002, 16008, 16006, 16006, 16006, 16009,
16010, 16006, 16006, 16006, 16010, 15995, 16008, 16008, 16010,
16009, 16008, 16010, 16006, 16006, 16009, 16006, 16006, 16006,
16010, 16006, 16006, 16006, 16008, 16009, 16007, 16010, 16007,
16006, 16009, 16007, 16002, 16007, 16010, 16008, 16010, 16006,
16009, 16010, 15936, 16008, 16008, 16010, 16006, 16007, 16008,
16009, 16008, 16008, 16010, 16002, 16006, 16006, 16006, 15936,
16009, 16007, 16009, 16006, 16007, 15995, 16006, 16010, 16006,
16006, 16010, 16010, 16008, 15995, 16006, 16007, 16008), class = "Date"),
Integer = c(39, 2, 1, 1, 4, 1, 5, 1, 4, 3, 14, 4, 4, 9,
3, 5, 5, 5, 2, 1, 5, 5, 5, 1, 16, 3, 3, 1, 2, 3, 1, 5, 5,
2, 5, 5, 5, 1, 5, 5, 5, 3, 2, 4, 1, 4, 5, 2, 4, 9, 4, 1,
3, 1, 5, 2, 1, 75, 3, 3, 1, 5, 4, 3, 2, 3, 3, 1, 9, 5, 5,
5, 75, 2, 4, 2, 5, 4, 16, 5, 1, 5, 5, 1, 1, 3, 16, 5, 4,
3)), .Names = c("ID", "Date", "Integer"
), row.names = c("200086", "200066", "200050", "200064", "200078",
"200050.1", "200069", "200082", "200083", "200053", "200056",
"200055", "200078.1", "200079", "200051", "200089", "200052",
"200057", "200061", "200050.2", "200060", "200080", "200057.1",
"200050.3", "200068", "200071", "200070", "200059", "200062",
"200051.1", "200067", "200057.2", "200060.1", "200072", "200069.1",
"200073", "200089.1", "200050.4", "200057.3", "200080.1", "200069.2",
"200070.1", "200081", "200054", "200063", "200075", "200052.1",
"200074", "200054.1", "200079.1", "200055.1", "200067.1", "200071.1",
"200082.1", "200089.2", "200072.1", "200050.5", "200084", "200053.1",
"200088", "200050.6", "200052.2", "200083.1", "200070.2", "200081.1",
"200053.2", "200088.1", "200082.2", "200079.2", "200057.4", "200080.2",
"200060.2", "200084.1", "200072.2", "200055.2", "200061.1", "200080.3",
"200075.1", "200068.1", "200080.4", "200067.2", "200065", "200057.5",
"200090", "200082.3", "200051.2", "200068.2", "200052.3", "200083.2",
"200076"), class = "data.frame")
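"There are multiple dates per ID": any(duplicated(df$X1)) disagrees with you on your sample data. Your IDs (the first column, which I assume is just called X1 in your example) are unique. Or do you mean that some dates have multiple IDs? Either way, please make a **small** example rather than 100 rows. – Spacedman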
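This is not clear: "tell me whether an ID has a sum of 14 integers or 4 separate integers within 12 months". What does "a sum of 14 integers" mean? 1 + 2 + 3 + 4 + 1 + 2 + 3 + 4 + 1 + 2 + 3 + 4 + 7 + 99 is a sum of 14 integers. Is that not what you mean? – Spacedman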
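I think you may be asking too many questions at once here, which discourages partial answers, so unless one person solves all of your problems you won't get any answers. I suggest you delete this post and create several instead. The first would be how to find which IDs have "Integer" column values that sum to 14. – Spacedman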
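Since the comments ask for a smaller example and a clearer rule, here is a minimal base R sketch of one possible reading of the requirement. It is not the poster's method. It assumes that "within 12 months" means the 365 days ending on each row's date, and that an ID qualifies if any such window has a sum of Integer >= 14 or contains at least 4 entries. The sample data, `window_days`, `flag_rows`, `row_flag`, and `indicator` are all hypothetical names introduced for illustration.

```r
# Small hypothetical sample (not the poster's data) with the same three columns.
df <- data.frame(
  ID      = c("A", "A", "A", "A", "B", "B", "C"),
  Date    = as.Date(c("2013-01-10", "2013-03-05", "2013-06-20", "2013-11-01",
                      "2012-01-15", "2013-06-01", "2013-02-02")),
  Integer = c(1, 2, 1, 3, 5, 5, 16),
  stringsAsFactors = FALSE
)

# Assumption: "within 12 months" is taken as the 365 days ending on a row's date.
window_days <- 365

# For one ID's dates and values, flag each row whose trailing window
# has a sum >= 14 or contains at least 4 entries.
flag_rows <- function(dates, values) {
  vapply(seq_along(dates), function(i) {
    in_window <- dates <= dates[i] & dates > dates[i] - window_days
    sum(values[in_window]) >= 14 || sum(in_window) >= 4
  }, logical(1))
}

# Apply per ID; ave() returns the results in the original row order.
df$row_flag <- as.logical(ave(seq_len(nrow(df)), df$ID, FUN = function(idx) {
  flag_rows(df$Date[idx], df$Integer[idx])
}))

# ID-level indicator: TRUE for every row of an ID if any of its rows was flagged.
df$indicator <- as.logical(ave(df$row_flag, df$ID, FUN = any))

print(df)
```

In this sample, ID "A" qualifies through the count rule (4 entries inside one window), "C" through the sum rule (a single value of 16), and "B" does not qualify because its two rows are more than 365 days apart. If the intended window is calendar months or a forward-looking window, only the `in_window` line would need to change.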