2013-03-27 69 views

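Sentiment analysis (SentiWordNet) - judging the context of a sentence

I am trying to determine whether a sentence is positive or negative using the following steps: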

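1) Retrieve the parts of speech (verb, noun, adjective, etc.) from the sentence using the Stanford NLP parser.

2) Use SentiWordNet to look up the positive and negative values associated with each word and its part of speech.

3) Sum the positive values and the negative values to get the net positive and net negative score of the sentence.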
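The three steps can be sketched roughly as follows, assuming every (word, part of speech) pair already maps to a single pair of scores. The lexicon `TOY_LEXICON` and the helper names are hypothetical and its numbers are made up for illustration; they are not real SentiWordNet values, and the tagging step is replaced by a hand-tagged list.

```python
# Toy lexicon: (word, pos) -> (positive, negative).
# Made-up numbers for illustration only, NOT real SentiWordNet values.
TOY_LEXICON = {
    ("good", "a"): (0.75, 0.0),
    ("movie", "n"): (0.0, 0.0),
    ("boring", "a"): (0.0, 0.5),
}


def sentence_scores(tagged_words, lexicon):
    """Sum the positive and negative scores over (word, pos) pairs."""
    total_pos = 0.0
    total_neg = 0.0
    for word, pos in tagged_words:
        # Assumption: words missing from the lexicon count as neutral.
        p, n = lexicon.get((word.lower(), pos), (0.0, 0.0))
        total_pos += p
        total_neg += n
    return total_pos, total_neg


def classify(tagged_words, lexicon):
    """Label the sentence by comparing the two sums."""
    total_pos, total_neg = sentence_scores(tagged_words, lexicon)
    if total_pos > total_neg:
        return "positive"
    if total_neg > total_pos:
        return "negative"
    return "neutral"


# Step 1 (POS tagging) is assumed to have produced this list already.
tagged = [("good", "a"), ("movie", "n")]
print(sentence_scores(tagged, TOY_LEXICON), classify(tagged, TOY_LEXICON))
```

This only works once each word has exactly one pair of scores, which is the part SentiWordNet does not provide directly.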

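The problem is that SentiWordNet returns a list of positive/negative values, one pair per sense/context of the word. Is it possible to pass the specific sentence, together with the part of speech, to the SentiWordNet parser so that it can judge the sense/context automatically and return just one pair of positive and negative values?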

Or is there any alternative solution to this problem?
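For illustration, two simple ways of collapsing a per-sense list into one pair are sketched below: taking the first sense only, or averaging over all senses. Neither looks at the sentence, so neither is real word-sense disambiguation. The function names and the sample scores are hypothetical, and the list is assumed to be ordered by sense number.

```python
def first_sense(pairs):
    """Return the (positive, negative) pair of sense #1 only."""
    return pairs[0]


def mean_of_senses(pairs):
    """Return the unweighted mean (positive, negative) over all senses."""
    count = len(pairs)
    mean_pos = sum(p for p, _ in pairs) / count
    mean_neg = sum(n for _, n in pairs) / count
    return mean_pos, mean_neg


# Hypothetical per-sense scores for one word, ordered by sense number.
senses = [(0.5, 0.0), (0.0, 0.25), (0.25, 0.125)]

print(first_sense(senses))
print(mean_of_senses(senses))
```

The first-sense choice relies on the assumption that lower sense numbers are the more common ones; averaging avoids that assumption but dilutes strongly polar senses.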

Thanks.

Answers

1

We can pass the POS to the SentiWordNet parser. Download the Pattern Python module:

from pattern.en import wordnet 

# Look up the verb senses of "kill" and print the weight of the first one
print(wordnet.synsets("kill", pos="VB")[0].weight)

wordnet.synsets returns a list of synsets, from which we pick the first item. The output is a tuple of (polarity, subjectivity). Hope this helps...

2

The SentiWordNet demo code may help you:

// Copyright 2013 Petter Törnberg 
// 
// This demo code has been kindly provided by Petter Törnberg <[email protected]> 
// for the SentiWordNet website. 
// 
// This program is free software: you can redistribute it and/or modify 
// it under the terms of the GNU General Public License as published by 
// the Free Software Foundation, either version 3 of the License, or 
// (at your option) any later version. 
// 
// This program is distributed in the hope that it will be useful, 
// but WITHOUT ANY WARRANTY; without even the implied warranty of 
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 
// GNU General Public License for more details. 
// 
// You should have received a copy of the GNU General Public License 
// along with this program. If not, see <http://www.gnu.org/licenses/>. 

import java.io.BufferedReader; 
import java.io.FileReader; 
import java.io.IOException; 
import java.util.HashMap; 
import java.util.Map; 

public class SentiWordNetDemoCode { 

    private Map<String, Double> dictionary; 

    public SentiWordNetDemoCode(String pathToSWN) throws IOException { 
     // This is our main dictionary representation 
     dictionary = new HashMap<String, Double>(); 

     // From String to list of doubles. 
     HashMap<String, HashMap<Integer, Double>> tempDictionary = new HashMap<String, HashMap<Integer, Double>>(); 

     BufferedReader csv = null; 
     try { 
      csv = new BufferedReader(new FileReader(pathToSWN)); 
      int lineNumber = 0; 

      String line; 
      while ((line = csv.readLine()) != null) { 
       lineNumber++; 

       // If it's a comment, skip this line. 
       if (!line.trim().startsWith("#")) { 
        // We use tab separation 
        String[] data = line.split("\t"); 
        String wordTypeMarker = data[0]; 

        // Example line: 
        // POS ID PosS NegS SynsetTerm#sensenumber Desc 
        // a 00009618 0.5 0.25 spartan#4 austere#3 ascetical#2 
        // ascetic#2 practicing great self-denial;...etc 

        // Is it a valid line? Otherwise, throw an exception. 
        if (data.length != 6) { 
         throw new IllegalArgumentException(
           "Incorrect tabulation format in file, line: " 
             + lineNumber); 
        } 

        // Calculate synset score as score = PosS - NegS 
        Double synsetScore = Double.parseDouble(data[2]) 
          - Double.parseDouble(data[3]); 

        // Get all Synset terms 
        String[] synTermsSplit = data[4].split(" "); 

        // Go through all terms of current synset. 
        for (String synTermSplit : synTermsSplit) { 
         // Get synterm and synterm rank 
         String[] synTermAndRank = synTermSplit.split("#"); 
         String synTerm = synTermAndRank[0] + "#" 
           + wordTypeMarker; 

         int synTermRank = Integer.parseInt(synTermAndRank[1]); 
         // What we get here is a map of the type: 
         // term -> {score of synset#1, score of synset#2...} 

         // Add map to term if it doesn't have one 
         if (!tempDictionary.containsKey(synTerm)) { 
          tempDictionary.put(synTerm, 
            new HashMap<Integer, Double>()); 
         } 

         // Add synset link to synterm 
         tempDictionary.get(synTerm).put(synTermRank, 
           synsetScore); 
        } 
       } 
      } 

      // Go through all the terms. 
      for (Map.Entry<String, HashMap<Integer, Double>> entry : tempDictionary 
        .entrySet()) { 
       String word = entry.getKey(); 
       Map<Integer, Double> synSetScoreMap = entry.getValue(); 

       // Calculate weighted average. Weigh the synsets according to 
       // their rank. 
       // Score = 1/1*first + 1/2*second + 1/3*third ..... etc. 
       // Sum = 1/1 + 1/2 + 1/3 ... 
       double score = 0.0; 
       double sum = 0.0; 
       for (Map.Entry<Integer, Double> setScore : synSetScoreMap 
         .entrySet()) { 
        score += setScore.getValue()/(double) setScore.getKey(); 
        sum += 1.0/(double) setScore.getKey(); 
       } 
       score /= sum; 

       dictionary.put(word, score); 
      } 
     } catch (Exception e) { 
      e.printStackTrace(); 
     } finally { 
      if (csv != null) { 
       csv.close(); 
      } 
     } 
    } 

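    public double extract(String word, String pos) { 
     // Look up "word#pos". Returning 0.0 (neutral) for unknown terms is an 
     // assumption made here to avoid a NullPointerException on unboxing. 
     Double score = dictionary.get(word + "#" + pos); 
     return score == null ? 0.0 : score; 
    } 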

    public static void main(String [] args) throws IOException { 
     if(args.length<1) { 
      System.err.println("Usage: java SentiWordNetDemoCode <pathToSentiWordNetFile>"); 
      return; 
     } 

     String pathToSWN = args[0]; 
     SentiWordNetDemoCode sentiwordnet = new SentiWordNetDemoCode(pathToSWN); 

     System.out.println("good#a "+sentiwordnet.extract("good", "a")); 
     System.out.println("bad#a "+sentiwordnet.extract("bad", "a")); 
     System.out.println("blue#a "+sentiwordnet.extract("blue", "a")); 
     System.out.println("blue#n "+sentiwordnet.extract("blue", "n")); 
    } 
} 
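As a small worked example of the value that extract() returns, the Python sketch below mirrors the two key steps of the Java code: parsing one tab-separated line into a synset score (PosS - NegS) and combining the scores of a term's senses into a rank-weighted average. The function names are hypothetical, the line is the example from the code comments, and the two-sense scores at the end are made up.

```python
def parse_swn_line(line):
    """Parse one tab-separated line into (pos, score, [(term, rank), ...])."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError("expected 6 tab-separated fields")
    pos_tag, _synset_id, pos_score, neg_score, terms, _gloss = fields
    # Synset score, as in the Java code: PosS - NegS.
    score = float(pos_score) - float(neg_score)
    ranked_terms = []
    for term in terms.split(" "):
        word, rank = term.rsplit("#", 1)
        ranked_terms.append((word, int(rank)))
    return pos_tag, score, ranked_terms


def weighted_score(rank_to_score):
    """Weight each sense score by 1/rank, then divide by the sum of weights."""
    numerator = sum(score / rank for rank, score in rank_to_score.items())
    denominator = sum(1.0 / rank for rank in rank_to_score)
    return numerator / denominator


line = ("a\t00009618\t0.5\t0.25\t"
        "spartan#4 austere#3 ascetical#2 ascetic#2\t"
        "practicing great self-denial")
print(parse_swn_line(line))

# Made-up example: sense #1 scores 0.5, sense #2 scores -0.25.
print(weighted_score({1: 0.5, 2: -0.25}))
```

For the made-up two-sense term the result is (0.5/1 + (-0.25)/2) / (1/1 + 1/2) = 0.375 / 1.5 = 0.25. In general the returned value is a single number between -1 and 1, where the first sense of the term has the largest influence.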

Could you elaborate on the value returned by the function sentiwordnet.extract() (perhaps with an example)? – 2018-01-12 14:03:56