Split a character vector into individual words in R

I have a character vector (vec) that looks like this:

[1] "super good dental associates" "cheap dentist in bel air md"  
    "dentures "     "dentures "      
    "in office teeth whitening"  "in office teeth whitening"  
    "dental gum surgery bel air, md" 
[8] "dental implants"    "dental implants"     
    "veneer teeth pictures"

I need to break this apart into individual words. I tried this:

singleWords <- strsplit(vec, ' ')[[1]]

However, this only splits the first element of the vector:

[1] "super"  "good"  "dental"  "associates"

How can I get all of the words as individual elements of a single vector?


2014-04-03 Cybernetic

Try 'sapply(vec, strsplit, " ")'; you can wrap it in 'unlist' if you want everything in one vector –

That does not split into individual words :( – Cybernetic

I added an example using 'sapply' –

You can try:

strsplit(paste(vec, collapse = " "), ' ')[[1]]
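
To see why this works: strsplit returns a list with one entry per input string, so taking [[1]] of the original call keeps only the words of the first string. Collapsing to a single string first means the list has exactly one entry. Here is a minimal sketch on a shortened, made-up stand-in for vec (the three strings are for illustration only):

```r
# Shortened stand-in for the vector in the question (illustrative values only)
vec <- c("super good dental associates", "dentures ", "dental implants")

# strsplit() returns a list with one entry per input string
pieces <- strsplit(vec, " ")
length(pieces)   # one entry per element of vec
pieces[[1]]      # only the words of the first string

# Collapse everything into one string first, then split once
singleWords <- strsplit(paste(vec, collapse = " "), " ")[[1]]
singleWords
```

Note that an element ending in a space (such as "dentures ") leaves two consecutive spaces after the collapse, so splitting on a single ' ' yields an empty string "" at that position.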


2014-04-03 18:07:20

Perfect!!! Thank you :) – Cybernetic

If this post solved your problem, @Cybernetic, please consider accepting it by clicking the check mark on the left, below the vote count. – gung

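@gung Thank you! –

Just to follow up on my comment, since you mentioned it did not work, take a look. Because a few elements have extra spaces, I suggest splitting on the regular expression \\s+ rather than on the single space from my comment. Cheers.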

> (newVec <- unlist(sapply(vec, strsplit, "\\s+", USE.NAMES = FALSE))) 
# [1] "super"  "good"  "dental"  "associates" "cheap"  "dentist" 
# [7] "in"   "bel"  "air"  "md"   "dentures" "dentures" 
#[13] "in"   "office"  "teeth"  "whitening" "in"   "office"  
#[19] "teeth"  "whitening" "dental"  "gum"  "surgery" "bel"  
#[25] "air,"  "md"   "dental"  "implants" "dental"  "implants" 
#[31] "veneer"  "teeth"  "pictures"
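
As a side note, strsplit is itself vectorized over its first argument, so the sapply wrapper is not strictly required. A sketch of the same idea, again on a made-up shortened vector rather than the full data from the question:

```r
# Shortened stand-in for the vector in the question (illustrative values only)
vec <- c("super good dental associates", "dentures ",
         "dental gum surgery bel air, md")

# strsplit() processes the whole vector; unlist() flattens the resulting list.
# "\\s+" matches one or more whitespace characters, so the trailing space in
# "dentures " does not leave an empty string behind.
newVec <- unlist(strsplit(vec, "\\s+"))
newVec
```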

And since I see a stray comma in there, it might be a good idea to clean up any remaining punctuation with gsub:

> gsub("[[:punct:]]", "", newVec)
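
A small sketch of what that cleanup does, using a made-up vector of words (the lone "-" is added here purely to illustrate an element that consists only of punctuation):

```r
# Illustrative words; "air," carries the stray comma, "-" is pure punctuation
newVec <- c("bel", "air,", "md", "-", "implants")

# Remove every punctuation character from each element
cleaned <- gsub("[[:punct:]]", "", newVec)
cleaned

# Elements that were only punctuation become "", so drop the empty strings
cleaned <- cleaned[nzchar(cleaned)]
cleaned
```

Whether to drop the empty strings depends on what the word list is used for afterwards; the nzchar step is an optional addition, not part of the answer above.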


2014-04-03 18:35:13
