Negating sentences using POS tagging

I'm trying to find a way to negate sentences based on POS tagging. Please consider:
include_once 'class.postagger.php';

function negate($sentence) {
    $tagger = new PosTagger('includes/lexicon.txt');
    $tags = $tagger->tag($sentence);

    // Build a "token/TAG" string from the tagger output
    $input = [];
    foreach ($tags as $t) {
        $input[] = trim($t['token']) . "/" . trim($t['tag']) . " ";
    }
    $sentence = implode(" ", $input);
    $postagged = $sentence;

    // Prepend "not" to every JJ, MD, RB, VB, VBD or VBN token
    // Todo: ignore negative words (not, never, neither)
    $sentence = preg_replace("/(\w+)\/(JJ|MD|RB|VB|VBD|VBN)\b/", "not$1/$2", $sentence);

    // Remove all POS tags
    $sentence = preg_replace("/\/[A-Z$]+/", "", $sentence);

    return "$postagged<br>$sentence";
}
BTW: In this example, I used the POS-tagging implementation and lexicon of Ian Barber. An example of this code running would be:
echo negate("I will never go to their place again");
I/NN will/MD never/RB go/VB to/TO their/PRP$ place/NN again/RB
I notwill notnever notgo to their place notagain
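As you can see (and this issue is also noted in a comment in the code), negative words themselves are being negated as well: never becomes notnever, which obviously shouldn't happen. Since my regex skills aren't all that strong, is there a way to exclude these words from the regex used?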
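To make the question concrete, here is a minimal sketch of one possible way to skip such words: a negative lookahead placed at the start of each token. The helper name negateTagged and the $negatives list are hypothetical and not part of the code above, and the function works on an already tagged "token/TAG" string so that it runs without the tagger class. The leading \b matters: without it the regex could still match "ever/RB" inside "never/RB".

```php
<?php
// Hypothetical helper (not from the original code): negates an already
// POS-tagged string of the form "token/TAG token/TAG ...".
// The default list of negative words is an assumption; extend it as needed.
function negateTagged(string $tagged, array $negatives = ['not', 'never', 'neither']): string
{
    // Escape each word so it is safe to embed in the pattern.
    $skip = implode('|', array_map(fn($w) => preg_quote($w, '/'), $negatives));

    // \b anchors the match at the start of a token; the negative lookahead
    // then rejects tokens that are negative words (case-insensitively).
    $pattern = '/\b(?!(?i:' . $skip . ')\/)(\w+)\/(JJ|MD|RB|VB|VBD|VBN)\b/';
    $negated = preg_replace($pattern, 'not$1/$2', $tagged);

    // Remove all POS tags, as in the original function.
    return preg_replace('/\/[A-Z$]+/', '', $negated);
}

echo negateTagged('I/NN will/MD never/RB go/VB to/TO their/PRP$ place/NN again/RB');
// prints: I notwill never notgo to their place notagain
```

This only addresses the exclusion of negative words; the other tokens are still negated exactly as before.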
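[edit] Also, I would very much welcome any other comments or criticisms you might have on this negation implementation, since I'm sure it's (still) quite flawed :-)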
http://stackoverflow.com/questions/2633353/algorithm-for-negating-sentences