How do I categorize text into groups in R?

I have a vector of words:

c('Apple','Orange','Apple','Carrot','Onion','Onion')

which I would like to categorize using

list('fruit' = c('Apple', 'Orange'),
     'vegetable' = c('Carrot', 'Onion'))

The output I am looking for is

c('fruit', 'fruit', 'fruit', 'vegetable', 'vegetable', 'vegetable')

My current approach is to convert each of them to a data.table and use merge to get the category. Is there a simpler solution?

Source

2016-06-30 imsc

In your simple case, you can stick with vectors. I would try `l <- c('Apple', 'Orange', 'Carrot', 'Onion'); m <- rep(c('fruit', 'vegetable'), each = 2); m[match(x, l)]`. I think we have a lot of dupes on this. –
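The one-liner in the comment above can be written out as a runnable sketch. The names `l` and `m` come from the comment; `x` is assumed to be the word vector from the question, and `result` is a name introduced here for illustration.

```r
# x is assumed to be the question's input vector (the comment does not define it)
x <- c('Apple', 'Orange', 'Apple', 'Carrot', 'Onion', 'Onion')

# Two parallel vectors: l holds the items, m holds the category at the same position
l <- c('Apple', 'Orange', 'Carrot', 'Onion')
m <- rep(c('fruit', 'vegetable'), each = 2)

# match() returns the position of each element of x in l;
# indexing m by those positions yields the category for each word
result <- m[match(x, l)]
print(result)
```

Words that do not appear in `l` come back as `NA`, because `match()` returns `NA` for anything it cannot find.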

Since you already have a lookup list, @Marek's answer in the linked dupe should be fine. It is certainly clean ("This is by far the easiest way", J. Ulrich) – Henrik

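["Character matching provides a powerful way to make lookup tables"](http://adv-r.had.co.nz/Subsetting.html#applications) (also appears in the linked dupe); perhaps even simpler, if your lookup table is structured differently. – Henrik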
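As a minimal sketch of the character-matching idea quoted above, the lookup list from the question can be flattened into a named character vector and then subset by name. The names `lookup` and `result` are hypothetical, introduced here for illustration only.

```r
x <- c('Apple', 'Orange', 'Apple', 'Carrot', 'Onion', 'Onion')
lst <- list('fruit' = c('Apple', 'Orange'),
            'vegetable' = c('Carrot', 'Onion'))

# Build a named vector: names are the items, values are their categories.
# rep() repeats each category name once per item in that category.
lookup <- setNames(rep(names(lst), lengths(lst)),
                   unlist(lst, use.names = FALSE))

# Subsetting by name performs the lookup; unname() drops the item names
result <- unname(lookup[x])
print(result)
```

This assumes every item belongs to exactly one category; if an item appeared under two categories, subsetting by name would return only the first match.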

Here is an alternative:

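x <- c('Apple', 'Orange', 'Apple', 'Carrot', 'Onion', 'Onion')
lst <- list('fruit' = c('Apple', 'Orange'),
            'vegetable' = c('Carrot', 'Onion'))
# stack() turns the list into a data frame with columns `values` (items)
# and `ind` (list names); match() then looks up each word's row
with(stack(lst), ind[match(x, values)])
# [1] fruit     fruit     fruit     vegetable vegetable vegetable
# Levels: fruit vegetable
# (the result is a factor; wrap it in as.character() for a character vector)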

Source

2016-06-30 21:19:22 lukeA
