2011-11-09 52 views

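Spaces in wordcloud

I currently use Wordle for many artistic uses of word clouds. I think R's wordcloud package may offer better control.

1) How do you keep a capitalized word capitalized in the word cloud? [solved]

2) How do you keep two words together as a single chunk in the wordcloud? (Wordle uses the ~ operator to do this; R's wordcloud just prints the ~ as is.) [For example, there is a ~ between "to" and "be" where I want a space in the word cloud.]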

library(wordcloud)

y<-c("the", "the", "the", "tree", "tree", "tree", "tree", "tree", 
"tree", "tree", "tree", "tree", "tree", "Wants", "Wants", "Wants", 
"Wants", "Wants", "Wants", "Wants", "Wants", "Wants", "Wants", 
"Wants", "Wants", "to~be", "to~be", "to~be", "to~be", "to~be", 
"to~be", "to~be", "to~be", "to~be", "to~be", "to~be", "to~be", 
"to~be", "to~be", "to~be", "to~be", "to~be", "to~be", "to~be", 
"to~be", "when", "when", "when", "when", "when", "familiar", "familiar", 
"familiar", "familiar", "familiar", "familiar", "familiar", "familiar", 
"familiar", "familiar", "familiar", "familiar", "familiar", "familiar", 
"familiar", "familiar", "familiar", "familiar", "familiar", "familiar", 
"leggings", "leggings", "leggings", "leggings", "leggings", "leggings", 
"leggings", "leggings", "leggings", "leggings") 

wordcloud(names(table(y)), table(y)) 

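Your original code was reproducible, and I based my answer on it. After your edit it is no longer reproducible, and my answer no longer makes sense. – Andrie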

@Andrie Sorry, some of those were custom functions; I have removed them for future reference –

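Answers

You asked two questions:

  1. You can control capitalization (keep it or not) by passing a control argument to TermDocumentMatrix.
  2. No doubt there is an argument somewhere that controls the ~, but here is a simple workaround: use gsub to change each ~ to a space in the step just before plotting.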

Some code:

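library(tm)        # provides Corpus, VectorSource, TermDocumentMatrix
library(wordcloud)

corpus <- Corpus(VectorSource(y))
# Edit 1: tolower = FALSE keeps the original capitalization of each term
tdm <- TermDocumentMatrix(corpus, control = list(tolower = FALSE))

m <- as.matrix(tdm)
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
# Edit 2: replace "~" with a space just before plotting
d$word <- gsub("~", " ", d$word)

wordcloud(d$word, d$freq)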
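
If tm is not needed for anything else, the same two steps can probably be done with base R alone. The following is only a sketch: the shortened vector y is illustrative stand-in data rather than the full vector from the question, the names freq, words and counts are made up for this example, and the plot call is guarded so the snippet still runs where wordcloud is not installed.

```r
# Shortened stand-in for the question's vector y (illustrative data only)
y <- c(rep("the", 3), rep("Wants", 12), rep("to~be", 20), rep("when", 5))

# table() counts each distinct string exactly as written,
# so "Wants" keeps its capital letter
freq <- table(y)

# Swap the Wordle-style "~" for a space in the labels only
words  <- gsub("~", " ", names(freq), fixed = TRUE)
counts <- as.vector(freq)

# Draw the cloud only if the wordcloud package is available
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(words, counts, min.freq = 1)
}
```

Because the replacement happens after counting, "to~be" is still tallied as one term and only its displayed label changes.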


That is an easy fix. I was trying to handle it with the tm package, which was unnecessary. Thanks. –