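Extracting sentences from a character vector that satisfy two conditions in R
Suppose we have a full text document loaded into R as a character vector. I'm looking for code that extracts all the text between two periods (".") whenever the text between those periods contains the word "and" and at least one "%".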
character <- as.character("Walmart stocks remained the same. Sony reported an increase, and the percent was posted at 1.0%. And the google also remained the same. And the percent of increase for Best Buy was 2.5%.")
Consider the simple example above. I'm looking for output along the lines of:
[1] Sony reported an increase, and the percent was posted at 1.0%.
[2] And the percent of increase for Best Buy was 2.5%.
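One way to get this output, offered as a minimal base-R sketch rather than a definitive answer: split the string into sentences at whitespace that follows a period, then keep the sentences that contain both the word "and" (in any case, since the second expected sentence starts with "And") and a "%". The names `txt`, `sentences`, `has_and`, `has_pct` and `hits` are illustrative, and the split rule assumes every sentence ends in a period followed by whitespace, which is what keeps decimals such as 1.0 intact.

```r
txt <- "Walmart stocks remained the same. Sony reported an increase, and the percent was posted at 1.0%. And the google also remained the same. And the percent of increase for Best Buy was 2.5%."

# Split at whitespace that is preceded by a period. The period inside "1.0%"
# is followed by a digit, not whitespace, so it does not trigger a split.
sentences <- strsplit(txt, "(?<=\\.)\\s+", perl = TRUE)[[1]]

# Condition 1: the sentence contains "and" as a whole word, in any case.
has_and <- grepl("\\band\\b", sentences, ignore.case = TRUE, perl = TRUE)

# Condition 2: the sentence contains at least one literal "%".
has_pct <- grepl("%", sentences, fixed = TRUE)

# Keep only the sentences that satisfy both conditions.
hits <- sentences[has_and & has_pct]
print(hits)
```

On the sample string this keeps the two sentences listed above. Matching "and" as a whole word is a design choice: a plain substring match would also accept words such as "brand" or "standard".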
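Works like a charm! The only problem came up when I used large text files from the web in my application: the files are long enough that sentences get cut off and continue on the next line. I got around this by wrapping my readLines call in paste, which turns the whole text file into a single character string, like so: 'paste(readLines("websiteurl.txt"), collapse = "") %>%'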
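The collapsing step described in this comment can be sketched on its own. This is a hedged illustration, not the commenter's exact pipeline: a temporary file with made-up content stands in for websiteurl.txt, and the lines are joined with a single space, which is an assumption made here so that words at the end of one line do not fuse with the first word of the next.

```r
# A temporary file stands in for the downloaded text (hypothetical content),
# with sentences deliberately wrapped across lines.
path <- tempfile(fileext = ".txt")
writeLines(c("Sony reported an increase, and the percent",
             "was posted at 1.0%. Walmart stocks",
             "remained the same."), path)

# Read the file line by line, then join the lines into one string.
txt <- paste(readLines(path), collapse = " ")

# Sentences that were broken across lines are whole again after the join.
sentences <- strsplit(txt, "(?<=\\.)\\s+", perl = TRUE)[[1]]
print(sentences)

unlink(path)  # remove the temporary file
```

If the source file breaks lines in the middle of words rather than between them, an empty separator may be the better choice.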