Does R 3.5.0 support the regular expression \\L?

I am having trouble with the Perl expression \\L\\1 in a very particular situation on R-dev (the 2017-06-06 and 2017-06-16 r72796 builds):

bib <- readLines("https://raw.githubusercontent.com/HughParsonage/TeXCheckR/master/tests/testthat/lint_bib_in.bib", encoding = "UTF-8") 

leading_spaces <- 2 

is_field <- grepl("=", bib, fixed = TRUE) 
field_width <- nchar(trimws(gsub("[=].*$", "", bib, perl = TRUE))) 

widest_field <- max(field_width[is_field]) 

out <- bib 

# Vectorized gsub: 
for (line in seq_along(bib)) {
  # Replace every field line with
  # two spaces + field name + spaces required for widest field + space
  if (is_field[line]) {
    spaces_req <- widest_field - field_width[line]
    out[line] <-
      gsub("^\\s*(\\w+)\\s*[=]\\s*\\{",
           paste0(paste0(rep(" ", leading_spaces), collapse = ""),
                  "\\L\\1",
                  paste0(rep(" ", spaces_req), collapse = ""),
                  " = {"),
           bib[line],
           perl = TRUE)
  }
}

# Add commas: 
out[is_field] <- gsub("\\}$", "\\},", out[is_field], perl = TRUE) 

out[9] 
#> R-dev " author" 
#> R 3.4.0 " author  = {Tony Wood and Amélie Hunter and Michael O'Toole and Prasana Venkataraman and Lucy Carter},"

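To reproduce, it is necessary to:

Read the lines from a file with readLines and specify the encoding. (Using dput will not reproduce it.)
Use \\L or \\U in a Perl regular expression.
Use a character vector in which at least one element requires UTF-8 (the é in Amélie above).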
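
These conditions can be sketched in a smaller script that writes a UTF-8 file locally instead of downloading one. The file contents and variable names below are made up for illustration, and it is an assumption that this reduced case goes through the same code path as the full example; it has not been confirmed to trigger the R-dev behaviour.

```r
# Hypothetical two-line stand-in for the .bib file; "\u00e9" is the é in Amélie.
txt <- c("  TITLE = {Some report},",
         "  AUTHOR = {Am\u00e9lie Hunter},")

# Write the lines as UTF-8 bytes regardless of the session locale.
path <- tempfile(fileext = ".bib")
con <- file(path, open = "wb")
writeLines(enc2utf8(txt), con, useBytes = TRUE)
close(con)

# Condition 1: read from a file and declare the encoding.
lines <- readLines(path, encoding = "UTF-8")

# Condition 3: only the element containing é is marked as UTF-8.
print(Encoding(lines))

# Condition 2: use \\L in a Perl-style replacement.
res <- gsub("^\\s*(\\w+)\\s*=\\s*\\{", "  \\L\\1 = {", lines, perl = TRUE)
print(res)
```

Comparing the second element of res between R versions should show whether the text after the field name survives.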

Is this a change in R 3.5.0, or have I been misusing \\L in this case?

2017-06-16 Hugh

Well, you have been warned: [*It may contain bugs, so be careful if you use it.*](https://cran.r-project.org/bin/windows/base/rdevel.html). –

I can't run the snippet - what is 'leading_spaces'? –

This particular bug is causing an error in a package's R CMD check. Sorry, I have edited the question. – Hugh

There is clearly some unexpected behaviour.

When only \1 is referenced, it works and outputs:

[1] " author  = {Tony Wood and Amélie Hunter and Michael O'Toole and Prasana Venkataraman and Lucy Carter},"

However, whenever \U or \L is used with \1, the second backreference is removed.

"\\U\\1": [1] " AUTHOR"
"\\U\\1\\E\\2": [1] " AUTHOR"

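For comparison, here is a minimal sketch of what these escapes are documented to do with perl = TRUE, on ASCII-only input. The input strings and the two-group pattern are made up for illustration and are not the pattern from the question.

```r
x <- "author = {Tony Wood}"
pattern <- "^(\\w+)(.*)$"  # hypothetical pattern: field name, then the rest

# \\U upper-cases the rest of the replacement, here just group 1.
upper_only <- sub(pattern, "\\U\\1", x, perl = TRUE)

# \\E ends the case conversion, so group 2 is copied through unchanged.
upper_then_rest <- sub(pattern, "\\U\\1\\E\\2", x, perl = TRUE)

# \\L works the same way but lower-cases.
lower_then_rest <- sub(pattern, "\\L\\1\\E\\2", "AUTHOR = {Tony Wood}", perl = TRUE)

print(upper_only)       # [1] "AUTHOR"
print(upper_then_rest)  # [1] "AUTHOR = {Tony Wood}"
print(lower_then_rest)  # [1] "author = {Tony Wood}"
```

Here "\\U\\1" alone drops the rest only because the pattern matches the whole string and group 2 is not put back, which is different from the truncation shown above.
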
A gsubfn solution still works (here, with toupper() as an example):

library(gsubfn) 
bib <- readLines("https://raw.githubusercontent.com/HughParsonage/TeXCheckR/master/tests/testthat/lint_bib_in.bib", encoding = "UTF-8") 
leading_spaces <- 2 
is_field <- grepl("=", bib, fixed = TRUE) 
field_width <- nchar(trimws(gsub("[=].*$", "", bib, perl = TRUE))) 
widest_field <- max(field_width[is_field]) 
out <- bib 

# Vectorized gsub: 
for (line in seq_along(bib)) {
  # Replace every field line with
  # two spaces + field name + spaces required for widest field + space
  if (is_field[line]) {
    spaces_req <- widest_field - field_width[line]
    out[line] <-
      gsubfn("^\\s*(\\w+)\\s*=\\s*\\{",
             function(y) paste0(
               paste0(rep(" ", leading_spaces), collapse = ""),
               toupper(y),
               paste0(rep(" ", spaces_req), collapse = ""),
               " = {"
             ),
             bib[line], engine = "R"
      )
  }
}
# Add commas: 
out[is_field] <- gsub("\\}$", "},", out[is_field], perl = TRUE) 

out[9]

Output:

[1] " AUTHOR  = {Tony Wood and Amélie Hunter and Michael O'Toole and Prasana Venkataraman and Lucy Carter},"
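
If depending on gsubfn is not an option, the same idea can be sketched in base R: extract the field name with regexec() and regmatches(), change its case with tolower() or toupper(), and paste the line back together, so that no \L or \U escape is involved. The helper name align_field and its arguments are hypothetical, and this sketch has not been run against the R-dev builds discussed here.

```r
# Hypothetical helper: rewrite one "field = {" prefix without \\L or \\U.
align_field <- function(line, leading_spaces = 2, spaces_req = 0,
                        case_fun = tolower) {
  # regexec() reports the full match plus each capture group.
  m <- regmatches(line, regexec("^\\s*(\\w+)\\s*=\\s*\\{", line))[[1]]
  if (length(m) == 0) {
    return(line)  # not a field line, leave it untouched
  }
  prefix <- paste0(strrep(" ", leading_spaces),
                   case_fun(m[2]),
                   strrep(" ", spaces_req),
                   " = {")
  # The match is anchored at the start, so the remainder begins right after it.
  paste0(prefix, substring(line, nchar(m[1]) + 1))
}

print(align_field("AUTHOR={Tony Wood and Lucy Carter},"))
print(align_field("@Article{key,"))
```

Inside the loop above this would replace the gsubfn() call, for example out[line] <- align_field(bib[line], leading_spaces, spaces_req).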

My sessionInfo details:

> sessionInfo() 
R Under development (unstable) (2017-06-19 r72808) 
Platform: i386-w64-mingw32/i386 (32-bit) 
Running under: Windows 7 x64 (build 7601) Service Pack 1 

Matrix products: default 

locale: 
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252 
[3] LC_MONETARY=English_United States.1252 
[4] LC_NUMERIC=C       
[5] LC_TIME=English_United States.1252  

attached base packages: 
[1] stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] gsubfn_0.6-6 proto_1.0.0 

loaded via a namespace (and not attached): 
[1] compiler_3.5.0 tools_3.5.0 tcltk_3.5.0

2017-06-19 13:28:37
