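Please help me with my small project. Splitting text vectors with strsplit(...) in R
I have a large list of text elements. Each element should be split into a small list of phrases. Each small list should be stored as a single element, like the original text element, at the same position ('row') in a new column of the initial large list.
The split criteria are "/$", "und/KON", and "oder/KON". Each of these should be kept at the head of the new small element it starts.
I tried regular expressions such as "/$|und/KON|oder/KON" and many combinations of escaping "$", "|", and "/". I also tried changing the parameters perl = TRUE and fixed = TRUE and FALSE. Nothing happens on any attempt. It seems that | is not interpreted correctly. How would you suggest solving this?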
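A small illustration of why these attempts likely return the string unsplit. This is only a sketch on a shortened sample string; the variable names are made up for the example. With fixed = TRUE the whole pattern, "|" included, is taken as one literal string. Without it, an unescaped "$" is an end-of-string anchor rather than a dollar sign.

```r
x <- "Element eins und/KON Element zwei ./$."

# fixed = TRUE: the pattern is one literal string (including the "|"),
# which does not occur in x, so nothing is split.
literal <- strsplit(x, "/$|und/KON|oder/KON", fixed = TRUE)[[1]]

# Regex mode, "$" unescaped: "/$" means "a slash at the end of the string",
# so only "und/KON" matches here.
anchored <- strsplit(x, "/$|und/KON|oder/KON")[[1]]

# Regex mode, "$" escaped: "/\\$" matches the literal characters "/$".
escaped <- strsplit(x, "/\\$|und/KON|oder/KON")[[1]]

length(literal)
length(anchored)
length(escaped)
```

Note that even the escaped version removes the matched text, because strsplit() drops whatever the pattern matches; keeping the delimiter at the head of the next piece needs a different approach.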
library(stringr) # not sure whether this is required
# Input list to be split at each
# "/$", "und/KON", "oder/KON",
# keeping the expression at the start of the next list element
#
# Would be nice but not necessary: the small lists named after the ID in the first column
r <- list(ID=c(01, 02, 03),
          elements=c("This should become my first small-list :/$. the first element ,/$, the second element ,/$, and the third element ./$.",
                     "This should become my second small-list :/$. Element eins und/KON Element zwei oder/KON Element drei ./$.",
                     "This should become my third small-list :/$. Element Alpha und/KON Element Beta oder/KON Element Gamma ./$."))
# Would look something like this (my attempt, which does not split anything)
r$small_lists <- sapply(r$elements, function(x) as.list(strsplit(x, "/$|und/KON|oder/KON", fixed=TRUE)))
# Desired result:
> r$small_lists
$01
[1] "This should become my first small-list "
[2] ":/$. the first element "
[3] ",/$, the second element "
[4] ",/$, and the third element "
[5] "./$."
$02
[1] "This should become my second small-list "
[2] ":/$. Element eins "
[3] "und/KON Element zwei "
[4] "oder/KON Element drei"
[5] "./$."
$03
[1] "This should become my third small-list "
[2] ":/$. Element Alpha "
[3] "und/KON Element Beta "
[4] "oder/KON Element Gamma "
[5] "./$."
> class(r)
[1] "list"
> class(r$small_lists)
[1] "list"
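One possible way to get the output shown above, as a sketch rather than a definitive solution: first insert a marker in front of every split criterion with gsub(), then split on that marker literally. The marker string "<SPLIT>" and the variable names are assumptions made for this example; the marker must be something that never occurs in the real text.

```r
# Sample data copied from the question.
r <- list(
  ID = c(1, 2, 3),
  elements = c(
    "This should become my first small-list :/$. the first element ,/$, the second element ,/$, and the third element ./$.",
    "This should become my second small-list :/$. Element eins und/KON Element zwei oder/KON Element drei ./$.",
    "This should become my third small-list :/$. Element Alpha und/KON Element Beta oder/KON Element Gamma ./$."
  )
)

# Hypothetical marker; it must never occur in the real text.
marker <- "<SPLIT>"

# Either one non-space character followed by a literal "/$" (note the escaped "$"),
# or one of the two conjunction tags.
pattern <- "(\\S/\\$|und/KON|oder/KON)"

# Step 1: put the marker in front of every match; "\\1" re-inserts the matched text.
marked <- gsub(pattern, paste0(marker, "\\1"), r$elements, perl = TRUE)

# Step 2: split on the marker as a literal string, so the criteria stay in the pieces.
small_lists <- strsplit(marked, marker, fixed = TRUE)

# Optional: name each small list after its zero-padded ID ("01", "02", "03").
names(small_lists) <- sprintf("%02d", r$ID)

r$small_lists <- small_lists
r$small_lists[["02"]]
```

Splitting on an inserted marker avoids zero-width lookahead patterns in strsplit(), which can behave unexpectedly in R.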
I don't see a question here. – A5C1D2H2I1M1N2O1R2T1
@AnandaMahto: Sorry, thanks, done :) – alex
Thanks! :) For my better understanding, could you explain how "&^\\1" and "^&*" work, respectively? – alex