Counting categorical data in R

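I have a list of the counties in each state that received nonattainment status in each year from 1995 to 2005.

I want to know how many counties in each state received this status each year.

If my data is formatted like this,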

State1 County1 Yr1 Yr2 Yr3 Yr4...
State1 County2 Yr1 Yr2 Yr3 Yr4
State2 County1 Yr1 Yr2.....

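Each year variable can be 1 or 0, because a county may gain or lose this status over time.

I need to count, for each year, how many counties in each state have nonattainment status (Yrx = 1), but I can't think of how to do it.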

2010-06-18 Alison

Is this data organized as a data frame? If so, how are the rows defined? If your data is organized like this:

State County Year Attainment 
State1 County1 1  1 
State1 County1 2  0 
State1 County1 3  1 
State1 County1 4  1 
State1 County2 1  1 
State1 County2 2  1 
...

then you can get the kind of summary you're looking for with 1 line of code. Hopefully, your notation means that your data is organized like this:

State County Yr1 Yr2 Yr3 Yr4 
State1 County1 1 0 1 1 
State1 County2 1 1 1 1

Use melt() from the reshape package to get from this format to the layout shown first.

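library(reshape)              # provides melt() and cast()
new.df <- melt(df, id = 1:2)  # columns 1 and 2 (State, County) are the id variables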

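It will call the year variable "variable" and the attainment variable "value". Now, with clever use of the cast function, also from the reshape package, you can get the summary you want.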

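# Counts rows per state for each value (0 or 1), pooled over all years
counties <- cast(new.df, State ~ value, fun = length)
head(counties)
# For a count per year instead, use: cast(new.df, State ~ variable, fun = sum)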

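However, if your data is organized so that every state, county, and year is its own column, and it is only 1 row long, I think your best next step would be to reformat the data outside of R so that it at least looks like my second example.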
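For the long layout shown first in this answer, the 1-line summary is not spelled out. One possibility is a cross-tabulation with base R's xtabs(). This is only a sketch: the data frame long.df and its values are made up for illustration, and xtabs() is one choice that fits the description, not necessarily the line originally intended.

```r
# Hypothetical long-format data: one row per state/county/year
long.df <- data.frame(
  State      = c("State1", "State1", "State1", "State1", "State2", "State2"),
  County     = c("County1", "County1", "County2", "County2", "County1", "County1"),
  Year       = c(1, 2, 1, 2, 1, 2),
  Attainment = c(1, 0, 1, 1, 0, 1)
)

# Sum the 0/1 flags within each State x Year cell;
# each cell is then the number of counties with status 1
counts <- xtabs(Attainment ~ State + Year, data = long.df)
print(counts)
```

Because the flags are 0 or 1, summing them is the same as counting the counties that have the status, so no separate filtering step is needed.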

2010-06-18 18:30:23 JoFrhwld

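It's organized the second way. I thought I had formatted my question correctly, but when I posted it, it merged into one line. – Alison 2010-06-18 19:31:11

JoFrhwld - I did that, but it gave me a summary across all years rather than a summary for each year. I'm still working on this, so I'd appreciate any suggestions you have. At least your help got me further! – Alison 2010-06-21 17:38:39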

I used the following example:

data <- read.table(textConnection(" 
state county Yr1 Yr2 Yr3 Yr4 
state1 county1 1 0 0 1 
state1 county2 0 0 0 0 
state1 county3 0 1 0 0 
state1 county4 0 0 0 0 
state1 county5 0 1 0 1 
state2 county6 0 0 0 0 
state2 county7 0 0 1 0 
state2 county8 1 0 0 1 
state2 county9 0 0 0 0 
state2 county10 0 1 0 0 
state3 county11 1 1 1 1 
state3 county12 0 0 0 0 
state3 county13 0 1 1 0 
state3 county14 0 0 0 1 
state4 county15 0 0 0 0 
state4 county16 1 0 1 0 
state4 county17 0 0 0 0 
state4 county18 1 1 1 1 
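"), header = TRUE)

library(reshape)
# Stack the year columns into variable/value pairs, keeping state and county as ids
data2 <- melt(data, id = c("state", "county"))
# One row per state, one column per year; summing the 0/1 flags counts the counties
cast(data2, state ~ variable, fun = sum)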

Result:

   state Yr1 Yr2 Yr3 Yr4
1 state1   1   2   0   2
2 state2   1   1   1   1
3 state3   1   2   2   2
4 state4   2   1   2   1
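The same per-state, per-year counts can probably also be computed without reshaping. A minimal base-R sketch, assuming only that the year columns hold 0/1 flags: it swaps in aggregate() for the melt()/cast() pair, and the data frame wide is a hypothetical subset (the first two states) of the example data.

```r
# Subset of the example data: state1 and state2 only
wide <- read.table(text = "
state county Yr1 Yr2 Yr3 Yr4
state1 county1 1 0 0 1
state1 county2 0 0 0 0
state1 county3 0 1 0 0
state1 county4 0 0 0 0
state1 county5 0 1 0 1
state2 county6 0 0 0 0
state2 county7 0 0 1 0
state2 county8 1 0 0 1
state2 county9 0 0 0 0
state2 county10 0 1 0 0
", header = TRUE)

# Sum each year column within each state; county is left out of the formula
per_state <- aggregate(cbind(Yr1, Yr2, Yr3, Yr4) ~ state, data = wide, FUN = sum)
print(per_state)
```

The state1 and state2 rows should match the first two rows of the result above. The trade-off is that the year columns have to be named explicitly in cbind(), which gets unwieldy with many years; melt() avoids that.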

2010-06-18 18:51:39

Wow, thank you so much. I've been struggling with this all afternoon. I'll give it a try. – Alison 2010-06-18 19:22:40
