R: count characters in a column

I want to add a column that holds the number of letters a-z found in another column of the same row.

dataset$count <-length((gregexpr('[a-z]', as.character(dataset$text))[[1]]))

does not work.

The result I want to achieve:

text | count 
a  | 1 
ao | 2 
ao2 | 2 
as2e | 3 
as2eA | 3

Source

2011-06-17 Chris

Can you give an example? I can interpret this in many ways. – Andrie

Sure... basically I want to count every lowercase letter. – Chris

One trick:

nchar(gsub("[^a-z]","",x))
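
As a rough sketch of how this one-liner could be applied to the question's data: the data frame below is reconstructed from the table in the question, so its exact layout is an assumption. Note that character ranges such as [a-z] can be locale-dependent in R; [[:lower:]] is the portable alternative if that matters for your data.

```r
# Sample data reconstructed from the table in the question (assumed layout).
dataset <- data.frame(
  text = c("a", "ao", "ao2", "as2e", "as2eA"),
  stringsAsFactors = FALSE
)

# Delete every character that is not a lowercase a-z,
# then count the characters that remain in each string.
dataset$count <- nchar(gsub("[^a-z]", "", dataset$text))

print(dataset)
```

Because gsub and nchar are both vectorized, no loop or sapply call is needed.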

Source

2011-06-17 11:51:51 Marek

Nice hack! Definitely better than 'gregexpr'. – aL3xa

This should do the trick:

numchars<-function(txt){ 
    #basically your code, but to be applied to 1 item 
    tmpres<-gregexpr('[a-z]', as.character(txt))[[1]] 
    ifelse(tmpres[1]==-1, 0, length(tmpres)) 
} 
#now apply it to all items: 
dataset$count <-sapply(dataset$text, numchars)

Another option is more of a two-step approach:

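#step 1: keep the whole list of match positions (one vector per row, no [[1]])
charmatches<-gregexpr('[a-z]', as.character(dataset$text))
#step 2: count positive positions only, since -1 means "no match"
dataset$count<-sapply(charmatches, function(m) sum(m > 0))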
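
A possible variant of the two-step idea, sketched here under the assumption that base R 3.2.0 or later is available (for lengths): regmatches turns the positions into the matched substrings, and it yields an empty vector when nothing matched, so the -1 case needs no special handling. The vector x is a made-up example, not data from the question.

```r
# Hypothetical sample vector; mixes matching and non-matching strings.
x <- c("a", "ao2", "as2eA", "AAA", "")

# gregexpr() finds the match positions; regmatches() extracts the matched
# substrings and gives character(0) for elements without any match.
matches <- regmatches(x, gregexpr("[a-z]", x))

# lengths() counts the extracted matches for each element of x.
counts <- lengths(matches)

print(counts)
```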

Source

2011-06-17 11:48:21

gregexpr doesn't seem to work the way I expected: – Chris

gregexpr('[a-z]', as.character("AAA"))[[1]] gives a result of length 1, and so does gregexpr('[a-z]', as.character("AAAa"))[[1]] – Chris

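Right. When there is no match, gregexpr apparently returns a vector with a single element holding -1. I edited accordingly (though @Marek's solution is cooler) –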
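
The behaviour discussed in these comments can be checked with a small sketch like the following. The variable names are only illustrative; the two input strings are the ones from the comment above.

```r
no_match  <- gregexpr("[a-z]", "AAA")[[1]]
one_match <- gregexpr("[a-z]", "AAAa")[[1]]

# Both results have length 1, so length() alone cannot tell them apart.
length(no_match)
length(one_match)

# The values differ: -1 signals "no match", 4 is the position of the match.
as.integer(no_match)
as.integer(one_match)

# Counting only the positive positions gives the real number of matches.
sum(no_match > 0)
sum(one_match > 0)
```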
