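How do I get NN and NNS from a text?
I want to extract the NN or NNS tokens from the sample text in the script below. When I run the code below, the output is: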
types
synchronization
phase
synchronization
-RSB-
synchronization
-LSB-
-RSB-
projection
synchronization
Why am I getting [-RSB-] or [-LSB-] here? Should I use a different pattern to get both NN and NNS?
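import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.Label;
import edu.stanford.nlp.ling.TaggedWord;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;

// tagger, parser, parserr, rs, docs_terms and docs_terms_unq are initialized elsewhere.
String atic = "So far, many different types of synchronization have been investigated, such as complete synchronization [8], generalized synchronization [9], phase synchronization [10], lag synchronization [11], projection synchronization [12, 13], and so forth.";
Reader reader = new StringReader(atic);
DocumentPreprocessor dp = new DocumentPreprocessor(reader);
docs_terms_unq.put(rs.getString("u"), new ArrayList<String>());
docs_terms.put(rs.getString("u"), new ArrayList<String>());
// Matches every node whose basic category is NN or NNS.
TregexPattern NPpattern = TregexPattern.compile("@NN|NNS");
for (List<HasWord> sentence : dp) {
    List<TaggedWord> tagged = tagger.tagSentence(sentence);
    GrammaticalStructure gs = parser.predict(tagged);
    Tree x = parserr.parse(sentence);
    System.out.println(x);
    TregexMatcher matcher = NPpattern.matcher(x);
    while (matcher.findNextMatchingNode()) {
        Tree match = matcher.getMatch();
        ArrayList<Label> hh = match.yield(); // the words under the matched node
        System.out.println(hh.toString());
    }
}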
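For context, -LSB- and -RSB- are the Penn Treebank escape tokens for "[" and "]". Their appearance in the output above suggests that the parser labelled some of those bracket tokens as nouns, so the NN|NNS pattern matched them. The following is a minimal sketch of one possible workaround, not a definitive fix: it assumes the tagged tokens are available as word/tag pairs and drops the bracket escapes after checking the tag. The class name NounFilter and the String[] pair representation are hypothetical and are not part of the Stanford library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Stanford CoreNLP.
final class NounFilter {
    // Penn Treebank escapes that the tokenizer substitutes for bracket characters.
    private static final Set<String> BRACKET_ESCAPES =
            Set.of("-LRB-", "-RRB-", "-LSB-", "-RSB-", "-LCB-", "-RCB-");

    // Each element of tagged is assumed to be a two-element array: {word, posTag}.
    static List<String> nouns(List<String[]> tagged) {
        List<String> result = new ArrayList<>();
        for (String[] pair : tagged) {
            String word = pair[0];
            String tag = pair[1];
            boolean isNoun = tag.equals("NN") || tag.equals("NNS");
            // Keep only nouns that are not escaped brackets mis-tagged as nouns.
            if (isNoun && !BRACKET_ESCAPES.contains(word)) {
                result.add(word);
            }
        }
        return result;
    }
}
```

With the Stanford tagger, the pairs could presumably be built from each TaggedWord's word() and tag() values before calling NounFilter.nouns.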
Thank you very much. I had indeed noticed that my previous approach reduced the number of nouns found!
One small question: should I use the same approach when I need NPs? In the code I posted, I used TregexPattern.compile("@NP !<< @NP"). So can I use partOfSpeechTag.equals("@NP !<< @NP")?