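As usual, the hard part is collecting the data, but I happened to have it archived from the US Census. Run the "State/region data" section below first, then run the following lines: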
df <- data.frame(emails = c("[email protected]", "[email protected]", "[email protected]",
                            "[email protected]", "[email protected]"),
                 states = c("NV", "CA", "UT", "AZ", "IA"))
# For each state, take the name of the region.list element that contains it
df$regions <- sapply(df$states,
                     function(x) names(region.list)[grep(x, region.list)])
# Then write to the desktop, for example, with:
write.csv(df, "~/Desktop/nameHere.csv", row.names = FALSE)
Output:
             emails states regions
1 [email protected]     NV    West
2 [email protected]     CA    West
3 [email protected]     UT    West
4 [email protected]     AZ    West
5 [email protected]     IA Midwest
State/region data:
NE.name <- c("Connecticut", "Maine", "Massachusetts", "New Hampshire",
             "Rhode Island", "Vermont", "New Jersey", "New York",
             "Pennsylvania")
NE.abrv <- c("CT", "ME", "MA", "NH", "RI", "VT", "NJ", "NY", "PA")
NE.ref <- c(NE.name, NE.abrv)
MW.name <- c("Indiana", "Illinois", "Michigan", "Ohio", "Wisconsin",
             "Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska",
             "North Dakota", "South Dakota")
MW.abrv <- c("IN", "IL", "MI", "OH", "WI", "IA", "KS", "MN", "MO", "NE",
             "ND", "SD")
MW.ref <- c(MW.name, MW.abrv)
S.name <- c("Delaware", "District of Columbia", "Florida", "Georgia",
            "Maryland", "North Carolina", "South Carolina", "Virginia",
            "West Virginia", "Alabama", "Kentucky", "Mississippi",
            "Tennessee", "Arkansas", "Louisiana", "Oklahoma", "Texas")
S.abrv <- c("DE", "DC", "FL", "GA", "MD", "NC", "SC", "VA", "WV", "AL",
            "KY", "MS", "TN", "AR", "LA", "OK", "TX")
S.ref <- c(S.name, S.abrv)
W.name <- c("Arizona", "Colorado", "Idaho", "New Mexico", "Montana",
            "Utah", "Nevada", "Wyoming", "Alaska", "California",
            "Hawaii", "Oregon", "Washington")
W.abrv <- c("AZ", "CO", "ID", "NM", "MT", "UT", "NV", "WY", "AK", "CA",
            "HI", "OR", "WA")
W.ref <- c(W.name, W.abrv)
region.list <- list(
  Northeast = NE.ref,
  Midwest = MW.ref,
  South = S.ref,
  West = W.ref)
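One caveat about the lookup above: `grep(x, region.list)` coerces each list element to a single string and does a substring search, and a state that matches no region returns `character(0)`, which can turn `regions` into a list column. If that matters for your data, a minimal sketch of an exact-match alternative is below. It assumes the `states` column holds only two-letter abbreviations; `region.abrv` and `region.lookup` are names made up for this example, and `"ZZ"` is a deliberately invalid code.

```r
# Abbreviations only, copied from the region data above
region.abrv <- list(
  Northeast = c("CT", "ME", "MA", "NH", "RI", "VT", "NJ", "NY", "PA"),
  Midwest   = c("IN", "IL", "MI", "OH", "WI", "IA", "KS", "MN", "MO", "NE",
                "ND", "SD"),
  South     = c("DE", "DC", "FL", "GA", "MD", "NC", "SC", "VA", "WV", "AL",
                "KY", "MS", "TN", "AR", "LA", "OK", "TX"),
  West      = c("AZ", "CO", "ID", "NM", "MT", "UT", "NV", "WY", "AK", "CA",
                "HI", "OR", "WA")
)

# Flatten into a named character vector: names are states, values are regions
region.lookup <- setNames(rep(names(region.abrv), lengths(region.abrv)),
                          unlist(region.abrv, use.names = FALSE))

# Exact-match lookup by name; unknown codes come back as NA
states <- c("NV", "CA", "UT", "AZ", "IA", "ZZ")
regions <- unname(region.lookup[states])
print(regions)
```

Because the lookup is by name, the result is always a plain character vector of the same length as the input, so it can be assigned directly to `df$regions`.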
Maybe you need `split(df1$states, df1$regions)`, or you need a separate column for each region and then use `dcast`, i.e. `library(data.table); dcast(setDT(df1), rowid(regions) ~ regions, value.var = "states")` – akrun
@akrun Thank you, that got me started, but I have a quick question: how would I group these states by region? The regions column is the output I want. – sim
I think the best option is to use `split` to get a `list`, as mentioned in my comment above – akrun
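For reference, a minimal sketch of what the `split` suggestion produces. The `df1` here is a small made-up stand-in with the `states` and `regions` columns from the answer's output, and `by.region` and `combined` are names chosen for this example only.

```r
# Hypothetical stand-in for df1, using the states/regions from the answer's output
df1 <- data.frame(states  = c("NV", "CA", "UT", "AZ", "IA"),
                  regions = c("West", "West", "West", "West", "Midwest"),
                  stringsAsFactors = FALSE)

# One list element per region, each holding that region's states
by.region <- split(df1$states, df1$regions)
print(by.region)

# Optionally collapse each group into a single comma-separated string
combined <- vapply(by.region, paste, character(1), collapse = ", ")
print(combined)
```

`split` returns a named list, so `by.region$West` gives the states in that region; the `vapply` step is only needed if one string per region is more convenient than a vector.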