2013-12-03 64 views
9

Executing and testing the Stanford CoreNLP example

I downloaded the Stanford CoreNLP package and tried to test it on my machine.

Using the command: java -cp "*" -mx1g edu.stanford.nlp.sentiment.SentimentPipeline -file input.txt

I get the sentiment result in the form positive or negative. input.txt contains the sentences to be tested.

Another command: java -cp stanford-corenlp-3.3.0.jar;stanford-corenlp-3.3.0-models.jar;xom.jar;joda-time.jar -Xmx600m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,parse -file input.txt prints the following lines when executed:

H:\Drive E\Stanford\stanfor-corenlp-full-2013~>java -cp stanford-corenlp-3.3.0.j 
ar;stanford-corenlp-3.3.0-models.jar;xom.jar;joda-time.jar -Xmx600m edu.stanford 
.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,parse -file 
input.txt 
Adding annotator tokenize 
Adding annotator ssplit 
Adding annotator pos 
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3wo 
rds/english-left3words-distsim.tagger ... done [36.6 sec]. 
Adding annotator lemma 
Adding annotator parse 
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCF 
G.ser.gz ... done [13.7 sec]. 

Ready to process: 1 files, skipped 0, total 1 
Processing file H:\Drive E\Stanford\stanfor-corenlp-full-2013~\input.txt ... wri 
ting to H:\Drive E\Stanford\stanfor-corenlp-full-2013~\input.txt.xml { 
    Annotating file H:\Drive E\Stanford\stanfor-corenlp-full-2013~\input.txt [13.6 
81 seconds] 
} [20.280 seconds] 
Processed 1 documents 
Skipped 0 documents, error annotating 0 documents 
Annotation pipeline timing information: 
PTBTokenizerAnnotator: 0.4 sec. 
WordsToSentencesAnnotator: 0.0 sec. 
POSTaggerAnnotator: 1.8 sec. 
MorphaAnnotator: 2.2 sec. 
ParserAnnotator: 9.1 sec. 
TOTAL: 13.6 sec. for 10 tokens at 0.7 tokens/sec. 
Pipeline setup: 58.2 sec. 
Total time for StanfordCoreNLP pipeline: 79.6 sec. 

H:\Drive E\Stanford\stanfor-corenlp-full-2013~> 

I could not understand this output. It contains no informative result.

I found an example at: stanford core nlp java output

import java.io.*; 
import java.util.*; 

import edu.stanford.nlp.io.*; 
import edu.stanford.nlp.ling.*; 
import edu.stanford.nlp.pipeline.*; 
import edu.stanford.nlp.trees.*; 
import edu.stanford.nlp.util.*; 

public class StanfordCoreNlpDemo { 

    public static void main(String[] args) throws IOException {
        PrintWriter out;
        if (args.length > 1) {
            out = new PrintWriter(args[1]);
        } else {
            out = new PrintWriter(System.out);
        }
        PrintWriter xmlOut = null;
        if (args.length > 2) {
            xmlOut = new PrintWriter(args[2]);
        }

        StanfordCoreNLP pipeline = new StanfordCoreNLP();
        Annotation annotation;
        if (args.length > 0) {
            annotation = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
        } else {
            annotation = new Annotation("Kosgi Santosh sent an email to Stanford University. He didn't get a reply.");
        }

        pipeline.annotate(annotation);
        pipeline.prettyPrint(annotation, out);
        if (xmlOut != null) {
            pipeline.xmlPrint(annotation, xmlOut);
        }
        // An Annotation is a Map and you can get and use the various analyses individually.
        // For instance, this gets the parse tree of the first sentence in the text.
        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        if (sentences != null && sentences.size() > 0) {
            CoreMap sentence = sentences.get(0);
            Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
            out.println();
            out.println("The first sentence parsed is:");
            tree.pennPrint(out);
        }
    }

} 

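I tried to run it in NetBeans with the necessary libraries included, but it always gets stuck partway through or throws the exception: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space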

I set the memory to be allocated in the property/run/VM box.

Any idea how I can run the above Java example from the command line?
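Before launching the full pipeline, it can help to confirm that the heap setting actually reaches the JVM, since an OutOfMemoryError often means the option was not applied or was too small. The following is a minimal sketch using only the standard library; the class name HeapCheck is hypothetical and not part of CoreNLP.

```java
// Hypothetical helper (not part of CoreNLP): reports the heap limit the JVM actually received.
final class HeapCheck {

    // Maximum amount of heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Compile with `javac HeapCheck.java` and run it as, for example, `java -Xmx2g HeapCheck`; the 2g value is only illustrative, and the reported figure can be slightly below the requested one. The same `-Xmx` flag goes before the class name when starting StanfordCoreNlpDemo from the command line.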

I want to get the sentiment score of the example.

UPDATE

Output of the sentiment command: java -cp "*" -mx1g edu.stanford.nlp.sentiment.SentimentPipeline -file input.txt

[Image: screenshot of the command output]

Output of: java -cp stanford-corenlp-3.3.0.jar;stanford-corenlp-3.3.0-models.jar;xom.jar;joda-time.jar -Xmx600m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,parse -file input.txt

[Image: output of the above command]

+0

What was produced by one of your examples? (i.e. "H:\Drive E\Stanford\stanfor-corenlp-full-2013~\input.txt.xml")

+0

@home: 'OutOfMemoryError': I have already tried the fixes I found on the web, but the same error still persists – user123

+0

@ElliottFrisch: please see my updated question – user123

Answers

14

You need to add the "sentiment" annotator to the list of annotators:

-annotators tokenize,ssplit,pos,lemma,parse,sentiment 

This will add a "sentiment" attribute to each sentence node in your XML.
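To read that attribute back from the generated input.txt.xml, a standard DOM parser is enough. The sketch below uses only the JDK; the class name SentimentXmlReader and the sample XML are assumptions for illustration, and the real file contains many more elements than the simplified sample shown here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of CoreNLP).
final class SentimentXmlReader {

    // Returns the "sentiment" attribute of every <sentence> element that carries one.
    static List<String> readSentiments(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("sentence");
        List<String> sentiments = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element sentence = (Element) nodes.item(i);
            // Skip <sentence> elements that have no sentiment attribute.
            if (sentence.hasAttribute("sentiment")) {
                sentiments.add(sentence.getAttribute("sentiment"));
            }
        }
        return sentiments;
    }

    public static void main(String[] args) throws Exception {
        // Assumed, simplified shape of the output file.
        String sample = "<root><document><sentences>"
                + "<sentence id=\"1\" sentiment=\"Negative\"><tokens/></sentence>"
                + "<sentence id=\"2\" sentiment=\"Positive\"><tokens/></sentence>"
                + "</sentences></document></root>";
        System.out.println(readSentiments(sample));
    }
}
```

For a real run, the contents of input.txt.xml would be read from disk and passed to readSentiments in place of the sample string.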

+0

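I found this example, in which not all 6 annotators are used for sentiment analysis, only 4. POS and lemma are not included; how does that affect the results? Example: https://blog.openshift.com/day-20-stanford-corenlp-performing-sentiment-analysis-of-twitter-using-java/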

3

According to the example here, you need to run the sentiment analysis:

java -cp "*" -mx5g edu.stanford.nlp.sentiment.SentimentPipeline -file input.txt 

Apparently this is a memory-expensive operation, and it may not complete with only 1 gigabyte. Then you can use the "Evaluate" tool:

java -cp "*" edu.stanford.nlp.sentiment.Evaluate edu/stanford/nlp/models/sentiment/sentiment.ser.gz input.txt 
+1

Elliott: you are right, but I used '-mx1g' to suit my system configuration, and it also worked on the command line – user123

+0

Do you have experience testing the sentiment part of Stanford CoreNLP? – user123

+0

No, I have experience with a commercial sentiment engine. You have the sentiment scores there in that image.

20

You can do the following in your code:

String text = "I am feeling very sad and frustrated."; 
Properties props = new Properties(); 
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, sentiment"); 
StanfordCoreNLP pipeline = new StanfordCoreNLP(props); 
<...> 
Annotation annotation = pipeline.process(text); 
List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class); 
for (CoreMap sentence : sentences) { 
    String sentiment = sentence.get(SentimentCoreAnnotations.SentimentClass.class); 
    System.out.println(sentiment + "\t" + sentence); 
} 

It prints the sentiment of each sentence along with the sentence itself, e.g. for "I am feeling very sad and frustrated.":

Negative I am feeling very sad and frustrated. 
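If a numeric score is wanted rather than the label, the label can be mapped to an ordinal. The sketch below assumes the five labels are "Very negative", "Negative", "Neutral", "Positive" and "Very positive", ordered from 0 to 4; that ordering and the class name SentimentLabels are assumptions for illustration, so check them against the labels your CoreNLP version actually prints.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of CoreNLP).
final class SentimentLabels {

    // Assumed label order, from most negative (0) to most positive (4).
    private static final List<String> ORDER = Arrays.asList(
            "very negative", "negative", "neutral", "positive", "very positive");

    // Returns 0..4 for a known label, or -1 if the label is not recognised.
    static int toScore(String label) {
        if (label == null) {
            return -1;
        }
        return ORDER.indexOf(label.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(toScore("Negative"));      // prints 1
        System.out.println(toScore("Very positive")); // prints 4
    }
}
```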
+0

Where do you pass the input sentence in the program? – Naveen

+0

Added to the example: Annotation annotation = pipeline.process(text); – saganas

1

This works fine for me:

Maven dependencies:

    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-corenlp</artifactId>
        <version>3.5.2</version>
        <classifier>models</classifier>
    </dependency>
    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-corenlp</artifactId>
        <version>3.5.2</version>
    </dependency>
    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-parser</artifactId>
        <version>3.5.2</version>
    </dependency>

Java code:

    public static void main(String[] args) throws IOException {
        String text = "This World is an amazing place";
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, sentiment");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation annotation = pipeline.process(text);
        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        for (CoreMap sentence : sentences) {
            String sentiment = sentence.get(SentimentCoreAnnotations.SentimentClass.class);
            System.out.println(sentiment + "\t" + sentence);
        }
    }

Result:

Very positive This World is an amazing place