2017-08-09 31 views
1

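Custom abbreviation of a vector of strings in R

I am trying to apply custom abbreviations to some strings in the vector degree_abrev, which is taken from the data frame gss.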

This is what I was able to come up with, but I would like to see whether anyone has a "prettier" way:

# Order matters: "Lt High School" must be replaced before "High School", which is a substring of it.
degree_abrev <- gsub("Lt High School", "LtHS", gss$degree)
degree_abrev <- gsub("High School", "HS", degree_abrev)
degree_abrev <- gsub("Junior College", "JC", degree_abrev)
degree_abrev <- gsub("Bachelor", "B", degree_abrev)
degree_abrev <- gsub("Graduate", "G", degree_abrev)
+1

I would put these in a table and match/merge on them rather than using regex (assuming that is possible). – Frank
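One way to read the table suggestion above is a named character vector used as a lookup table. This is a minimal sketch, not necessarily what the commenter had in mind: the `degree` vector is a made-up stand-in for `gss$degree` (the real column is not shown in the post), and `abbrev` is a hypothetical name.

```r
# Hypothetical stand-in for gss$degree; the real column is not shown in the post.
degree <- c("High School", "Lt High School", "Graduate",
            "Bachelor", "Junior College", "High School")

# Lookup table: names are the full labels, values are the abbreviations.
abbrev <- c("Lt High School" = "LtHS",
            "High School"    = "HS",
            "Junior College" = "JC",
            "Bachelor"       = "B",
            "Graduate"       = "G")

# Whole-string lookup by name. as.character() guards against the column
# being a factor, which would otherwise index by integer code.
# Labels that are missing from the table come back as NA.
degree_abrev <- unname(abbrev[as.character(degree)])
degree_abrev
```

Because the lookup matches whole strings rather than substrings, the order of the entries in the table does not matter, unlike the chained gsub calls.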

Answers

1

The plyr package has a mapvalues function that does this. I am sure there are other ways to do it as well.

> library(plyr)

> degree_abbrev <- c("Lt High School", "High School", "Junior College",
    "Bachelor", "Graduate")

> degree_abbrev
[1] "Lt High School" "High School" "Junior College" "Bachelor" "Graduate"

> degree_abbrev <- mapvalues(degree_abbrev,
    from = c("Lt High School", "High School", "Junior College", "Bachelor", "Graduate"),
    to = c("LtHS", "HS", "JC", "B", "G"))

> degree_abbrev
[1] "LtHS" "HS" "JC" "B" "G"
+1

Its successor, dplyr, has recode. –

+0

That works well, since I actually ended up using recode. – jesusgarciab
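For reference, the recode approach mentioned in the comments might look like the sketch below. It assumes the dplyr package is installed, and the `degree` vector is a made-up stand-in for `gss$degree`; the original poster's actual call is not shown in the thread.

```r
library(dplyr)

# Hypothetical stand-in for gss$degree.
degree <- c("Lt High School", "High School", "Junior College",
            "Bachelor", "Graduate")

# recode() takes old = new pairs; names that contain spaces must be quoted.
# Character values without a matching pair are left unchanged.
degree_abrev <- recode(degree,
                       "Lt High School" = "LtHS",
                       "High School"    = "HS",
                       "Junior College" = "JC",
                       "Bachelor"       = "B",
                       "Graduate"       = "G")
degree_abrev
```

As with mapvalues, the match is on the whole string, so "High School" does not interfere with "Lt High School".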

0

I don't know whether this is any prettier, but I prefer using sapply.

degree_abrev <- c("Lt High School", "High School", "Junior College", "Bachelor", "Graduate") 

sapply(strsplit(degree_abrev, " "), function(x){paste(substring(x, 1, 1), collapse = "")}) 
[1] "LHS" "HS" "JC" "B" "G"
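Note that taking initials gives "LHS" for "Lt High School", not the "LtHS" used in the question. If that distinction matters, one possible variation is to compute the initials and then patch the exceptions. This is only a sketch: `overrides` is a hypothetical name, and vapply is swapped in for sapply so the result is guaranteed to be a character vector.

```r
degree <- c("Lt High School", "High School", "Junior College",
            "Bachelor", "Graduate")

# First letter of every word, pasted together.
initials <- vapply(strsplit(degree, " ", fixed = TRUE),
                   function(words) paste(substring(words, 1, 1), collapse = ""),
                   character(1))

# Hypothetical override table for labels whose initials are not the wanted code.
overrides <- c("Lt High School" = "LtHS")
hit <- degree %in% names(overrides)
initials[hit] <- unname(overrides[degree[hit]])
initials
```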