2013-10-11 29 views
9

Get the string before the comma in R

I am a beginner with R. I have a vector in a data.frame like this:

city 
Kirkland, 
Bethesda, 
Wellington, 
La Jolla, 
Berkeley, 
Costa, Evie KW172NJ 
Miami, 
Plano, 
Sacramento, 
Middletown, 
Webster, 
Houston, 
Denver, 
Kirkland, 
Pinecrest, 
Tarzana, 
Boulder, 
Westfield, 
Fair Haven, 
Royal Palm Beach, Fl 
Westport, 
Encino, 
Oak Ridge, 

I want to clean it. What I want is every city name, i.e. the part before the comma. How can I get this result in R? Thanks!

Answers

11

You can use gsub with a bit of regular expression:

cities <- gsub("^(.*?),.*", "\\1", df$city) 

This one works, too:

cities <- gsub(",.*$", "", df$city) 
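As a minimal sketch of how the second pattern behaves on data like that in the question: the data frame `df` below is rebuilt by hand from a few of the question's rows, and the `city_clean` column name is an assumption made for illustration. Since only one match per string is needed, `sub` is enough here, and `trimws` (available in R 3.2.0 and later) removes any whitespace left at the ends.

```r
# Hypothetical data frame rebuilt from a few rows of the question
df <- data.frame(
  city = c("Kirkland, ", "La Jolla, ", "Costa, Evie KW172NJ ",
           "Royal Palm Beach, Fl ", "Oak Ridge, "),
  stringsAsFactors = FALSE
)

# Delete the first comma and everything after it, then trim whitespace
df$city_clean <- trimws(sub(",.*$", "", df$city))

print(df$city_clean)
```

Strings that contain no comma do not match the pattern, so they pass through unchanged.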

+1 I was about to suggest almost exactly the same thing... 'gsub("^(.+),.*", "\\1", df$city)'

2

You can use regexpr to find the position of the first comma in each element, and then use substr to cut the strings there:

x <- c("London, UK", "Paris, France", "New York, USA") 

substr(x,1,regexpr(",",x)-1) 
[1] "London" "Paris" "New York" 
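One caveat with this approach is that `regexpr` returns -1 for an element with no comma, so `substr(x, 1, -2)` would give an empty string for it. The sketch below is one possible way to guard against that; the sample vector and the names `pos`, `has_comma` and `out` are assumptions made for illustration.

```r
# Sample input: one element has no comma, one has a trailing comma
x <- c("London, UK", "Austin", "Miami,")

# Position of the first literal comma in each element (-1 when absent)
pos <- as.integer(regexpr(",", x, fixed = TRUE))

# Start from the original strings and cut only those containing a comma
out <- x
has_comma <- pos > 0
out[has_comma] <- substr(x[has_comma], 1, pos[has_comma] - 1)

print(out)
```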
4

Just for fun, you can use strsplit:

> x <- c("London, UK", "Paris, France", "New York, USA") 
> sapply(strsplit(x, ","), "[", 1) 
[1] "London" "Paris" "New York" 
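A slightly more defensive variant of the same idea, offered as a sketch rather than as the answer's own code: `fixed = TRUE` makes `strsplit` treat the comma as a literal string, and `vapply` checks that every element yields exactly one character value. The extra `"Miami,"` element and the names `parts` and `first` are assumptions added for illustration.

```r
x <- c("London, UK", "Paris, France", "New York, USA", "Miami,")

# Split each string on literal commas
parts <- strsplit(x, ",", fixed = TRUE)

# Take the first piece of each split; vapply enforces a character(1) result
first <- vapply(parts, function(p) p[1], character(1))

print(first)
```

An empty input string splits into a zero-length vector, so its first piece comes back as `NA`.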
2

This works as well:

x <- c("London, UK", "Paris, France", "New York, USA") 

library(qdap) 
beg2char(x, ",") 

## > beg2char(x, ",") 
## [1] "London" "Paris" "New York"