Classifying text into different categories based on similarity

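I am working on modeling natural-language sentences into classes, over a very large set of documents {news + articles}. Please see the example below: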

1- The System enables a user to shut down the server remotely ==> class 1 

2- The Application allows a customer to close the machine online ==> (must also be) class 1. Why?

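Because the two sentences share many synonym pairs {System ~ Application, enables ~ allows, user ~ customer, shut down ~ close, server ~ machine, remotely ~ online}. So I am training a classifier on some data, based on similarity rules or word synonyms, and possibly on simplifying the rules as far as possible while still getting the best results.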
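To make the synonym idea concrete, here is a minimal sketch of one way it could work: map each word to a canonical label for its synonym group, then compare the two sentences by Jaccard overlap of their token sets. The `SYNONYM_GROUPS` table is hand-written from the pairs listed above and is only an assumption for illustration; a real system would more likely pull synonyms from a thesaurus such as WordNet. The function names `normalize` and `jaccard` are made up for this example.

```python
import re

# Hypothetical synonym groups, copied from the pairs listed in the question.
SYNONYM_GROUPS = [
    {"system", "application"},
    {"enables", "allows"},
    {"user", "customer"},
    {"shut down", "close"},
    {"server", "machine"},
    {"remotely", "online"},
]

# Map every surface form to one canonical label per group
# (the alphabetically first member of the group).
CANONICAL = {}
for group in SYNONYM_GROUPS:
    label = min(group)
    for form in group:
        CANONICAL[form] = label


def normalize(sentence):
    """Lowercase, replace known synonyms by their canonical label, return a token set."""
    text = sentence.lower()
    # Replace longer forms first so "shut down" is handled as one unit.
    for form in sorted(CANONICAL, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(form) + r"\b", CANONICAL[form], text)
    return set(re.findall(r"[a-z]+", text))


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


s1 = "The System enables a user to shut down the server remotely"
s2 = "The Application allows a customer to close the machine online"

raw = jaccard(set(s1.lower().split()), set(s2.lower().split()))
normalized = jaccard(normalize(s1), normalize(s2))
print(raw, normalized)
```

Without the synonym step the two sentences only share the function words "the", "a" and "to"; after normalization their token sets coincide, which is the behaviour the example asks for.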

So the question is: what is the best strategy for configuring/tuning a classifier for this idea? Thanks in advance.


2015-08-28 Fawzi Belal

Any answers, please? –

Check out: radimrehurek.com/gensim/models/doc2vec.html – alvas

Have you looked at this?

Is there an algorithm that tells the semantic similarity of two phrases

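The most important thing is to decide what similarity means. Once you have done that, choosing a classifier is the easy part of this task (ID3, C4.5, bag of words, Naive Bayes, etc.).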
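As a rough illustration of that point, the sketch below fixes one possible definition of similarity (cosine similarity over bag-of-words counts) and then uses the simplest classifier that follows from it: assign the label of the most similar labeled example. The helper names `bag_of_words`, `cosine` and `classify`, and the second training sentence, are invented for this example. Plain word counts will not see synonyms, so the `similarity` argument is where a different definition would be plugged in.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase alphabetic tokens in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, labeled, similarity=cosine):
    """Return the label of the labeled example most similar to the text."""
    query = bag_of_words(text)
    best_label, _ = max(
        ((label, similarity(query, bag_of_words(example)))
         for example, label in labeled),
        key=lambda pair: pair[1],
    )
    return best_label


# Hypothetical training data: the first sentence is from the question,
# the second is made up to give a contrasting class.
labeled = [
    ("The System enables a user to shut down the server remotely", "class 1"),
    ("The report lists monthly sales figures for each region", "class 2"),
]

print(classify("A user can shut down the server from home", labeled))
print(classify("Sales figures for the northern region", labeled))
```

Swapping `cosine` for a synonym-aware or embedding-based measure changes what counts as "similar" without touching `classify`, which is why the similarity definition deserves most of the effort.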


2015-09-01 08:35:08 rpd
