How to convert a vector of strings to title case

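I have a vector of lowercase strings. I'd like to change them to title case, meaning the first letter of every word is capitalized. I've managed to do it with a double loop, but I'm hoping there's a more efficient and elegant way, perhaps a one-liner with gsub and a regex.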

Here's some sample data, along with the double loop that works, followed by the other things I tried that didn't work.

strings = c("first phrase", "another phrase to convert",
            "and here's another one", "last-one")

# For each string in the strings vector, find the starting position of
# each run of lower-case letters that begins at a word boundary
matches = gregexpr("\\b[a-z]+", strings)

# For each string in the strings vector, convert the first letter
# of each word to upper case
for (i in seq_along(strings)) {

    # Extract the start position of each regex match for string i
    match.positions = as.integer(matches[[i]])

    # Convert the letter at each match position to upper case
    for (j in seq_along(match.positions)) {
        substr(strings[i], match.positions[j], match.positions[j]) =
            toupper(substr(strings[i], match.positions[j], match.positions[j]))
    }
}

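This works, but it seems inordinately complicated. I only resorted to it after experimenting unsuccessfully with more straightforward approaches. Here are some of the things I tried, along with the output: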

# Google search suggested \\U might work, but evidently not in R 
gsub("(\\b[a-z]+)", "\\U\\1" ,strings) 
[1] "Ufirst Uphrase"    "Uanother Uphrase Uto Uconvert" 
[3] "Uand Uhere'Us Uanother Uone" "Ulast-Uone"     

# I tried this on a lark, but to no avail 
gsub("(\\b[a-z]+)", toupper("\\1"), strings) 
[1] "first phrase"    "another phrase to convert" 
[3] "and here's another one" "last-one"

The regex captures the correct positions in each string, as a call to gregexpr shows, but the replacement string clearly isn't working as intended.

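If you can't already tell, I'm relatively new to regex and would appreciate help getting the replacement to work correctly. I'd also like to learn how to structure the regex so that it doesn't capture a letter after an apostrophe, since I don't want to change the case of those letters.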

2013-04-03 eipi10

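The main problem is that you're missing perl=TRUE (and your regex is slightly off, although that may be a result of flailing around while trying to fix the first problem).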
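To see why the two attempts in the question behave as they do, here is a minimal sketch using one of the question's strings. The variable names (`x`, `no_perl`, `with_perl`) are made up for illustration. It shows that `\\U` is only treated as a case operator when `perl = TRUE`, and that `toupper("\\1")` is evaluated before `gsub()` ever sees it.

```r
x <- "first phrase"

# Without perl = TRUE, "\\U" is not a case operator; as the question's
# output shows, it comes through as a literal "U".
no_perl <- gsub("(\\b[a-z]+)", "\\U\\1", x)
no_perl
## [1] "Ufirst Uphrase"

# With perl = TRUE, "\\U" upper-cases the rest of the replacement.
# The group here captures the whole word, so the whole word is upper-cased.
with_perl <- gsub("(\\b[a-z]+)", "\\U\\1", x, perl = TRUE)
with_perl
## [1] "FIRST PHRASE"

# toupper() runs first, and a backslash and a digit have no upper-case
# form, so gsub() receives plain "\\1" and puts each match back unchanged.
identical(toupper("\\1"), "\\1")
## [1] TRUE
```

This is why the capture group needs to hold only the first letter of each word.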

Using [:lower:] instead of [a-z] is slightly safer in case your code ends up being run in some weird (sorry, Estonians) locale where z is not the last letter of the alphabet...

re_from <- "\\b([[:lower:]])([[:lower:]]+)" 
strings <- c("first phrase", "another phrase to convert", 
      "and here's another one", "last-one") 
gsub(re_from, "\\U\\1\\L\\2" ,strings, perl=TRUE) 
## [1] "First Phrase"    "Another Phrase To Convert" 
## [3] "And Here's Another One" "Last-One"
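One property of this pattern is worth seeing in action: because the second group uses `+`, a match needs at least two lower-case letters. That is what leaves the `s` in `here's` alone, but it also skips one-letter words. The sketch below compares it with a `*` variant; `re_plus`, `re_star` and the sample vector `x` are names and data invented for this illustration.

```r
re_plus <- "\\b([[:lower:]])([[:lower:]]+)"  # pattern from the answer
re_star <- "\\b([[:lower:]])([[:lower:]]*)"  # variant: second group may be empty

x <- c("a tale of two cities", "and here's another one")

# "+" protects the "s" after the apostrophe but leaves "a" untouched
plus_result <- gsub(re_plus, "\\U\\1\\L\\2", x, perl = TRUE)
plus_result
## [1] "a Tale Of Two Cities"   "And Here's Another One"

# "*" capitalises "a" but also the "s" after the apostrophe
star_result <- gsub(re_star, "\\U\\1\\L\\2", x, perl = TRUE)
star_result
## [1] "A Tale Of Two Cities"   "And Here'S Another One"
```

Which trade-off is acceptable depends on whether your data contains one-letter words.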

You may prefer to use \\E (stop capitalizing) rather than \\L (start lowercasing), depending on what rules you want to follow, e.g.:

string2 <- "using AIC for model selection" 
gsub(re_from, "\\U\\1\\E\\2" ,string2, perl=TRUE) 
## [1] "Using AIC For Model Selection"

2013-04-03 00:19:19

Hi @BenBolker, your re_from should be "\\b([[:alpha:]])([[:alpha:]]+)" rather than "\\b([[:lower:]])([[:lower:]]+)". Otherwise, using \\E in the last example makes no difference.
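The comment's point can be checked with a short sketch. With `[[:alpha:]]` the pattern also matches words that already contain capitals, so the choice between `\\E` and `\\L` becomes visible. The name `re_alpha` and the two result variables are introduced here for illustration only.

```r
string2  <- "using AIC for model selection"
re_alpha <- "\\b([[:alpha:]])([[:alpha:]]+)"

# \\E stops the case conversion after the first letter,
# so the rest of each word is left exactly as it was
keep_rest <- gsub(re_alpha, "\\U\\1\\E\\2", string2, perl = TRUE)
keep_rest
## [1] "Using AIC For Model Selection"

# \\L lower-cases the rest of each word, which flattens acronyms
lower_rest <- gsub(re_alpha, "\\U\\1\\L\\2", string2, perl = TRUE)
lower_rest
## [1] "Using Aic For Model Selection"
```

With the `[[:lower:]]` version, "AIC" is never matched at all, which is why both replacements gave the same result there.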

Without using regex, the help page for tolower has two example functions that will do this.

The more robust version is

capwords <- function(s, strict = FALSE) { 
    cap <- function(s) paste(toupper(substring(s, 1, 1)), 
        {s <- substring(s, 2); if(strict) tolower(s) else s}, 
          sep = "", collapse = " ") 
    sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s))) 
} 
capwords(c("using AIC for model selection")) 
## -> [1] "Using AIC For Model Selection"

To make your regex approach (almost) work, you need to set perl = TRUE:

gsub("(\\b[a-z]{1})", "\\U\\1" ,strings, perl=TRUE) 
## [1] "First Phrase"    "Another Phrase To Convert" 
## [3] "And Here'S Another One" "Last-One"

But you will need to deal with apostrophes slightly better, perhaps:

# Split on spaces, capitalize the first character of each piece, re-join
sapply(lapply(strsplit(strings, ' '), gsub,
              pattern = '^([[:alnum:]]{1})', replacement = '\\U\\1', perl = TRUE),
       paste, collapse = ' ')
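As an alternative to splitting and re-joining, the apostrophe case can also be handled inside a single regex. This is a sketch rather than a definitive solution: it assumes a negative lookbehind is acceptable, and the name `re_start` is made up for this example. The letter is capitalized only when the character before it is neither alphanumeric nor an apostrophe.

```r
strings <- c("first phrase", "another phrase to convert",
             "and here's another one", "last-one")

# (?<![[:alnum:]']) : the previous character is not a letter, digit or
#                     apostrophe (also true at the start of the string)
# ([[:lower:]])     : the lower-case letter to capitalise
re_start <- "(?<![[:alnum:]'])([[:lower:]])"

title_cased <- gsub(re_start, "\\U\\1", strings, perl = TRUE)
title_cased
## [1] "First Phrase"              "Another Phrase To Convert"
## [3] "And Here's Another One"    "Last-One"
```

One caveat of this choice: a word that directly follows an opening single quote would also be left in lower case.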

A quick search of SO found https://stackoverflow.com/a/6365349/1385941

2013-04-03 00:15:40 mnel

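Unfortunately, my own search of SO wasn't so quick and didn't turn up the question you mentioned. I tried "convert string to title case", "convert first letter of each word to upper case", "capitalize first letter of each word", etc., but somehow didn't hit on the magic search string. In any case, I'm glad I got answers to my question here, as they added more options and some additional insight into how regular expressions work. – eipi10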

There are already excellent answers here. Here's one using a convenience function from the reports package:

strings <- c("first phrase", "another phrase to convert", 
    "and here's another one", "last-one") 

CA(strings) 

## > CA(strings) 
## [1] "First Phrase"    "Another Phrase To Convert" 
## [3] "And Here's Another One" "Last-one"

Note that it doesn't capitalize "one" in "last-one", because doing so didn't make sense for my purposes.

Update: I maintain the qdapRegex package, which has the TC (title case) function that does true title case:

TC(strings) 

## [[1]] 
## [1] "First Phrase" 
## 
## [[2]] 
## [1] "Another Phrase to Convert" 
## 
## [[3]] 
## [1] "And Here's Another One" 
## 
## [[4]] 
## [1] "Last-One"

2013-04-03 00:27:16

I'll throw one more into the mix for fun:

topropper <- function(x) { 
    # Makes Proper Capitalization out of a string or collection of strings. 
    sapply(x, function(strn) {
        # Split on whitespace, upper-case the first character of each
        # piece, lower-case the rest, then re-join with single spaces
        s <- strsplit(strn, "\\s")[[1]] 
        paste0(toupper(substring(s, 1, 1)), 
               tolower(substring(s, 2)), 
               collapse = " ")
    }, USE.NAMES = FALSE) 
}

topropper(strings) 
## [1] "First Phrase"    "Another Phrase To Convert" "And Here's Another One" 
## [4] "Last-one" 

2013-04-03 04:00:27
