There is probably a more concise way, but you can try something like this:
library(stringi)
library(data.table)
# Drop empty lines, if any (x holds the input lines)
txt <- Filter(function(x) !stri_isempty(stri_trim(x)), x)
# Extract matches ([0-9]+ rather than [1-9]+, so that values such as 10% also match)
matches <- stri_match_all_regex(txt, "([\\w\\s]+)\\(([0-9]+)%\\);?")
matches[[1]]
##      [,1]               [,2]           [,3]
## [1,] "Chicken(31%);"    "Chicken"      "31"
## [2,] "Duck(16%);"       "Duck"         "16"
## [3,] "Wild duck(14%);"  "Wild duck"    "14"
## [4,] "Pigeon(4%);"      "Pigeon"       "4"
## [5,] "Goose(4%);"       "Goose"        "4"
## [6,] "Wild bird(4%);"   "Wild bird"    "4"
## [7,] "Tree sparrow(2%)" "Tree sparrow" "2"
# Rearrange
rows <- lapply(
  matches,
  function(x) setNames(as.list(as.numeric(x[, 3])), x[, 2]))
rbindlist(rows, fill=TRUE)
##    Chicken Duck Wild duck Pigeon Goose Wild bird Tree sparrow
## 1:      31   16        14      4     4         4            2
## 2:      NA   NA        NA     NA    NA        NA            2
## 3:       1   NA        NA     NA    NA        NA           NA
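To try the steps above end to end, here is a minimal sketch. The input vector `x` is hypothetical: it is reconstructed from the output shown above, since the original input is not given. The name `result` is introduced only for illustration, and the digit class is written as `[0-9]+` so that values containing a zero would also match.

```r
library(stringi)
library(data.table)

# Hypothetical input (an assumption, inferred from the printed output);
# note there is no space after the semicolons
x <- c(
  "Chicken(31%);Duck(16%);Wild duck(14%);Pigeon(4%);Goose(4%);Wild bird(4%);Tree sparrow(2%)",
  "",
  "Tree sparrow(2%)",
  "Chicken(1%)"
)

# Drop empty or whitespace-only lines
txt <- Filter(function(s) !stri_isempty(stri_trim(s)), x)

# One character matrix per line: full match, name, percentage
matches <- stri_match_all_regex(txt, "([\\w\\s]+)\\(([0-9]+)%\\);?")

# Turn each matrix into a named list (name -> numeric percentage)
rows <- lapply(
  matches,
  function(m) setNames(as.list(as.numeric(m[, 3])), m[, 2]))

# Stack the rows; fill = TRUE inserts NA where a name is absent from a line
result <- rbindlist(rows, fill = TRUE)
print(result)
```

Names that do not occur on a given line end up as NA because of `fill = TRUE`.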
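Explanation of the regular expression
([\\w\\s]+)  # At least one word character or whitespace *, 1st group
\\(          # Left parenthesis
([0-9]+)     # At least one digit, 2nd group. You can replace + with {1,2}. Use [0-9] rather than [1-9], which would miss values such as 10
%            # Percent sign
\\)          # Right parenthesis
;?           # Optional semicolon
* Could be \\w[\\w\\s]+, which forces the name to start with a word character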
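As a small illustration of two details in the pattern, the sketch below compares the digit classes `[1-9]` and `[0-9]`, and the footnoted `\\w[\\w\\s]+` variant for the name group. The test string `s` and the names `m1` and `m2` are hypothetical and chosen only to show the difference.

```r
library(stringi)

# Hypothetical line: one value contains a zero, and a space follows the semicolon
s <- "Chicken(10%); Duck(5%)"

# [1-9]+ cannot match "10", and [\w\s]+ lets the name start with whitespace
m1 <- stri_match_all_regex(s, "([\\w\\s]+)\\(([1-9]+)%\\);?")[[1]]

# [0-9]+ accepts any digit; \w[\w\s]+ makes the name start with a word character
m2 <- stri_match_all_regex(s, "(\\w[\\w\\s]+)\\(([0-9]+)%\\);?")[[1]]

print(m1)
print(m2)
```

With the first pattern the `Chicken(10%)` entry is skipped entirely and the remaining name keeps its leading space; the second pattern picks up both entries with clean names.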
What have you tried? Is the semicolon the column separator? If a row has fewer than 8 entries, what value do you want to fill in? – dd3