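CLUTO doc-term matrix to tm DocumentTermMatrix
I have a document-term matrix in CLUTO format:
#Document #Term #TotalItem
term-x weight-x term-y weight-y (nonzero terms only, one row per document)
Instead of building it from a corpus, I want to create a DocumentTermMatrix (tm package) from this file. Is this possible?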
CLUTO File:
2 3 3
1 3 3 4
2 8
Row File:
car
plane
Column File:
x
y
z
Solution:
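library(slam)  # provides read_stm_CLUTO()
library(tm)    # provides as.DocumentTermMatrix() and weightTf
dtm <- as.DocumentTermMatrix(read_stm_CLUTO(file), weightTf)  # file: path to the CLUTO file
rows <- scan("rows.txt", what = "", sep = "\n")        # one document name per line
columns <- scan("columns.txt", what = "", sep = "\n")  # one term per line
dtm$dimnames <- list(Docs = rows, Terms = columns)     # attach the row and column names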
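As a sanity check on what the conversion should produce, the sample above can be decoded by hand. This is only a base-R sketch of the CLUTO layout, not how slam or tm do it internally. The names cluto_lines, header and m are made up for illustration, and the file contents are held in memory rather than read from disk.

```r
# Sample CLUTO file from the question, held in memory instead of on disk.
cluto_lines <- c("2 3 3", "1 3 3 4", "2 8")
rows    <- c("car", "plane")   # contents of the row file
columns <- c("x", "y", "z")    # contents of the column file

# Header: number of documents, number of terms, number of nonzero entries.
header <- as.integer(strsplit(cluto_lines[1], " +")[[1]])
m <- matrix(0, nrow = header[1], ncol = header[2],
            dimnames = list(Docs = rows, Terms = columns))

# Each following line holds (term index, weight) pairs for one document.
for (i in seq_len(header[1])) {
  vals <- as.numeric(strsplit(trimws(cluto_lines[i + 1]), " +")[[1]])
  idx  <- vals[c(TRUE, FALSE)]   # odd positions: 1-based term indices
  wts  <- vals[c(FALSE, TRUE)]   # even positions: weights
  m[i, idx] <- wts
}

# The number of nonzero cells should match the third header field.
stopifnot(sum(m != 0) == header[3])
print(m)
```

For the sample, car ends up with x = 3 and z = 4, and plane with y = 8. The labelled DocumentTermMatrix should show the same nonzero entries.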
How about this? 'require(slam); as.DocumentTermMatrix(read_stm_CLUTO(file), weightTf)' – Ben
@Ben Perfect, you could post it as an answer so I can accept it. Is there a way to pass the row and column names as well? – metdos