stringi

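2 votes

1 answer

I want to split a string into two groups. The string has a very simple structure, but I can't get it to work. txt <- "text12-01-2016" It is always some letters followed by a date, and the date obviously starts with a digit. I tried the following regular expression at https://regex101.com/ and it separates the string properly: ([a-zA-Z]*)([0-9].*) 1. "text" 2. "12-01-2016" But when I try it in R, it fails: str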
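The asker's own R attempt is cut off in the excerpt, so the following is only a minimal sketch of one way to get both capture groups in base R, using the pattern and sample string quoted above. The variable names `m`, `letters_part`, and `date_part` are illustrative and not from the original question.

```r
# Sample input and pattern taken from the excerpt above
txt <- "text12-01-2016"
pattern <- "([a-zA-Z]*)([0-9].*)"

# regexec() records the position of the full match and of each capture group;
# regmatches() converts those positions into the matched substrings
m <- regmatches(txt, regexec(pattern, txt))[[1]]

# Element 1 is the full match; elements 2 and 3 are the capture groups
letters_part <- m[2]
date_part <- m[3]
print(c(letters_part, date_part))
```

With stringi loaded, `stri_match_first_regex(txt, pattern)` should give the same pieces as columns of a character matrix.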

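3 votes

1 answer

Comparing two large character vectors takes too long (removing stop words)

I am trying to prepare a dataset for machine learning. In the process, I want to remove (as stop words) words that occur very rarely (usually the result of bad OCR readings). Currently, I have a list of about 1 million words that I want to remove. However, processing my dataset with this setup takes a very long time. library(stringi) #generate the stopword list b <- stri_rand_strings(1000000, 4, pattern =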
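The code in the excerpt is truncated, so the slow step is not visible. As a hedged sketch only, assuming the text has already been split into one word per element, exact-match stop word removal can be done with a single vectorised `%in%` lookup instead of one pattern search per stop word. The vectors `words` and `stop_words` are small hypothetical stand-ins for the real data.

```r
# Hypothetical stand-ins for the tokenised dataset and the stop word list
words <- c("the", "qzxv", "model", "xkcd", "data")
stop_words <- c("qzxv", "xkcd")

# One vectorised membership test over the whole word vector
keep <- !(words %in% stop_words)

# Keep only the words that are not in the stop word list
cleaned <- words[keep]
print(cleaned)
```

This applies only to whole-word, exact matches; removing stop words from inside longer strings would need a different approach.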

2 votes

1 answer

Object 'C_stri_join' not found - using knitr in RStudio

When I use the Knit button in RStudio, I get the error object 'C_stri_join' not found. Here is an example: --- title: "Sample Document" output: html_document: toc: true theme: united --- <!-- %\VignetteEngine{knitr::knitr}

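9 votes

2 answers

Unexpected behavior of str_replace with "NA"

I am trying to convert strings to numbers and ran into some unexpected behavior of str_replace. Here is a minimal working example: library(stringr) x <- c("0", "NULL", "0") # This works, i.e. 0 NA 0 as.numeric(str_replace(x, "NULL", "")) # This doesn't, i.e. NA NA NA as.nume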
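The failing call is cut off in the excerpt, so this sketch does not reproduce it. It only shows, in base R and without `str_replace`, one way to reach the `0 NA 0` result the asker describes: mark the "NULL" entries as missing by logical indexing before converting. The name `nums` is illustrative.

```r
# Input vector from the excerpt above
x <- c("0", "NULL", "0")

# Replace only the elements equal to "NULL" with a missing character value
x[x == "NULL"] <- NA_character_

# Conversion keeps the missing entry as NA and parses the rest
nums <- as.numeric(x)
print(nums)
```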

2 votes

1 answer

Sort strings by their last word in R

sessionInfo() R version 3.2.2 (2015-08-14) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=

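0 votes

1 answer

Extract the 2 terms before a specific character

I want to extract the two words that precede a Twitter @handle x <- c("this is a @handle", "My name is @handle", "this string has @more than one @handle") The following extracts all the text before the last @handle only; I need it for every @handle (ext <- stringr::str_extract_all(x, "^.*@"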

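1 vote

2 answers

The replacement in stringi's stri_replace_first_regex is not treated as a regular expression

I have a string in which I want to replace the first matching pattern with its corresponding replacement. E.g. in my example below: if bb is found first, replace it with foo and replace nothing else, but if cc is found first, replace it with bar and replace nothing else. This behavior is almost what I need, except that the replacement argument is not interpreted as a regular expression but as a whole string. (The pattern argument, however, is treated as a regular expression, as desired.) stri_replace_first_regex(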

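0 votes

1 answer

R lapply with stringi and rbind

I want to split some data in a data frame by specific strings and count the frequencies. After trying a few approaches I came up with one, but there is a small error in my result. Example: the data frame data: data abc hello hello aaa zxy xyz The list: list abc bcd efg aaa My code: lapply(list$list, function(x){ t <- da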

3 votes

2 answers

Extract years from string and text data

I need to extract the start year and end year from a vector with these attribute values. yr<- c("June 2013 – Present (2 years 9 months)", "January 2012 – June 2013 (1 year 6 months)","2006 – Present (10 years)","2002 – 2006 (4 years)") yr June 2013 – Presen
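A minimal base R sketch of one possible approach, using the `yr` vector quoted in the excerpt: collect every run of four digits per element, then take the first as the start year and the second as the end year. Treating a missing second year as "Present" is an assumption based on the sample values, and the names `years`, `start_year`, and `end_year` are illustrative.

```r
# Input vector from the excerpt above
yr <- c("June 2013 – Present (2 years 9 months)",
        "January 2012 – June 2013 (1 year 6 months)",
        "2006 – Present (10 years)",
        "2002 – 2006 (4 years)")

# Find every run of four consecutive digits in each element
years <- regmatches(yr, gregexpr("[0-9]{4}", yr))

# The first four-digit run is taken as the start year
start_year <- vapply(years, function(y) y[1], character(1))

# Assumption: when only one year is found, the end is reported as "Present"
end_year <- vapply(years,
                   function(y) if (length(y) >= 2) y[2] else "Present",
                   character(1))

print(data.frame(start_year, end_year))
```

The durations in parentheses never contain four consecutive digits in the sample, so they are not picked up; real data with longer numbers would need a stricter pattern.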

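1 vote

1 answer

Package dependency error "there is no package called 'stringi'"

I created an R package and uploaded it to GitHub (microdadosBrasil). When I try to install the package (as a user would), I get the following error: devtools::install_github("lucasmation/microdadosBrasil") Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :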