2017-01-08 76 views
0

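Extracting strings that contain specific words

This code extracts the sentences that contain a specific word. The problem is that if I want to extract sentences based on several different words, I have to copy it several times. Is there a way to do this in one go, perhaps by giving it an array?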

String o = "Trying to extract this string. And also the one next to it.";
String[] sent = o.split("\\.");
List<String> output = new ArrayList<String>();
for (String sentence : sent) {
    if (sentence.contains("this")) {
        output.add(sentence);
    }
}
System.out.println(">>output=" + output);
+1

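There are several problems in your code. The `substring` method appears twice and is nested. Please try to fix that. Also, "wow" is unknown in this context. I'm sorry, but I don't get what you are trying to do... You have a string `"have"`, you split it on spaces, which gives you `{"have"}`, and you never use the array in the end. – torkleyy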

+0

Sorry. The code I posted was something I had been messing around with... –

+0

So your question is whether you can have multiple words, and a sentence should be extracted if any one of those words occurs in it? – torkleyy

Answers

0

You can try this:

String o = "Trying to extract this string. And also the one next to it.";
String[] sent = o.split("\\.");
List<String> keyList = new ArrayList<String>();
keyList.add("this");
keyList.add("these");
keyList.add("that");

List<String> output = new ArrayList<String>();

for (String sentence : sent) {
    for (String key : keyList) {
        if (sentence.contains(key)) {
            output.add(sentence);
            break; // one matching keyword is enough; avoids adding the sentence twice
        }
    }
}
System.out.println(">>output=" + output);
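Since the question asks about passing an array, the same nested loop could be wrapped in a method that takes the keywords as a parameter. This is only a sketch: the class name SentenceExtractor and the method name extractSentences are made up for illustration, and it keeps the plain substring semantics of String.contains (so "this" would also match inside "thistle").

```java
import java.util.ArrayList;
import java.util.List;

class SentenceExtractor {

    // Hypothetical helper: returns every sentence of text that contains
    // at least one of the given keywords (substring match, case sensitive).
    static List<String> extractSentences(String text, String... keywords) {
        List<String> output = new ArrayList<>();
        for (String sentence : text.split("\\.")) {
            for (String key : keywords) {
                if (sentence.contains(key)) {
                    output.add(sentence.trim()); // drop the space left after the period
                    break;                       // add each sentence at most once
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        String o = "Trying to extract this string. And also the one next to it.";
        System.out.println(">>output=" + extractSentences(o, "this", "next"));
    }
}
```

Because the parameter is a varargs array, an existing String[] of keywords can be passed directly as the second argument.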
0
String sentence = "First String. Second Int. Third String. Fourth Array. Fifth Double. Sixth Boolean. Seventh String"; 
List<String> output = new ArrayList<String>(); 

for(String each: sentence.split("\\.")){ 
    if(inKeyword(each)) output.add(each); 
} 

System.out.println(output); 

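Helper function:

// Returns true if the sentence contains any of the keywords (case-insensitive substring match).
public static boolean inKeyword(String currentSentence) {
    String[] keyword = {"int", "double"};

    for (String each : keyword) {
        if (currentSentence.toLowerCase().contains(each)) return true;
    }

    return false;
}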
0

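If you have a collection of words to filter on, named filter, and an array of sentences, you can use Collections.disjoint to check whether the words of a sentence overlap with the words to filter on. Sadly, this does not work if you filter on "However" and your sentence contains "However,", because the comma stays attached to the word when splitting on spaces.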

Collection<String> filter = /**/;
String[] sentences = /**/;
List<String> result = new ArrayList<>();
for (String sentence : sentences) {
    Collection<String> words = Arrays.asList(sentence.split(" "));
    // disjoint() is true when the collections share no element, so "not disjoint" means they overlap
    if (!Collections.disjoint(words, filter)) {
        result.add(sentence);
    }
}
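One possible way around the "However," limitation is to split on runs of non-word characters instead of single spaces, so that punctuation never stays attached to a word. The sketch below assumes ASCII words (by default `\W` in Java treats non-ASCII letters as separators); the class name DisjointFilter and the method name filterSentences are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

class DisjointFilter {

    // Hypothetical helper: keeps the sentences that share at least one word with filter.
    static List<String> filterSentences(String[] sentences, Collection<String> filter) {
        List<String> result = new ArrayList<>();
        for (String sentence : sentences) {
            // "\\W+" splits on any run of non-word characters, so "However," yields "However".
            List<String> words = Arrays.asList(sentence.split("\\W+"));
            if (!Collections.disjoint(words, filter)) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Collection<String> filter = new HashSet<>(Arrays.asList("However"));
        String[] sentences = {"However, it rained", "It was sunny"};
        System.out.println(filterSentences(sentences, filter));
    }
}
```

The match is still case sensitive and whole-word only, which is the same behaviour as the original snippet apart from the punctuation handling.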
0

With streams (splitting into sentences, then into words):

String o = "Trying to extract this string. And also the one next to it.";
Set<String> words = new HashSet<>(Arrays.asList("this", "also"));

List<String> output = Arrays.stream(o.split("\\.")).filter(
    sentence -> Arrays.stream(sentence.split("\\s")).anyMatch(
        word -> words.contains(word)
    )
).collect(Collectors.toList());

System.out.println(">>output=" + output);
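The stream version compares whole words exactly, so "This" at the start of a sentence would not match the keyword "this". If that matters, a variant could lower-case each word before the lookup and trim the sentences while it is at it. This is a sketch under that assumption; the class name CaseInsensitiveFilter and the method name extract are hypothetical, and the keyword set is expected to hold lower-case words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

class CaseInsensitiveFilter {

    // Hypothetical helper: lowerCaseKeywords must already be in lower case.
    static List<String> extract(String text, Set<String> lowerCaseKeywords) {
        return Arrays.stream(text.split("\\."))
                .map(String::trim)                          // remove the space after each period
                .filter(sentence -> !sentence.isEmpty())
                .filter(sentence -> Arrays.stream(sentence.split("\\s+"))
                        .map(word -> word.toLowerCase(Locale.ROOT))
                        .anyMatch(lowerCaseKeywords::contains))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>(Arrays.asList("this", "also"));
        String o = "Trying to extract THIS string. And also the one next to it. Nothing here.";
        System.out.println(">>output=" + extract(o, keywords));
    }
}
```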
0

You can use String.matches as follows.

String sentence = ...;
// Three alternative conditions; pick the one that fits:
if (sentence.matches(".*(you|can|use).*")) { ... }       // Plain substring match
if (sentence.matches(".*\\b(you|can|use)\\b.*")) { ... } // With word boundaries
if (sentence.matches("(?i).*(you|can|use).*")) { ... }   // Case insensitive ("You")

In Java 8, the following variations might do:

// Either a literal pattern:
String pattern = ".*(you|can|use).*";

// Or the same pattern built with a StringJoiner (separator, prefix, suffix):
String pattern = new StringJoiner("|", ".*(", ").*")
    .add("you")
    .add("can")
    .add("use")
    .toString();
// Or a stream on the words with a joining collector

Arrays.stream(o.split("\\.\\s*"))
    .filter(sentence -> sentence.matches(pattern))
    .forEach(System.out::println);
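The "stream with a joining collector" idea mentioned in the comment could look roughly like the sketch below. It goes a little beyond the answer: it quotes each word with Pattern.quote so that regex metacharacters in a keyword are taken literally, compiles the pattern once, and uses Matcher.find() so no surrounding `.*` is needed. The class name RegexSentenceFilter and both method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexSentenceFilter {

    // Hypothetical helper: builds \b(?:word1|word2|...)\b, case insensitive.
    static Pattern keywordPattern(List<String> words) {
        String alternation = words.stream()
                .map(Pattern::quote)                 // treat each word literally
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    // Hypothetical helper: keeps the sentences in which the pattern occurs somewhere.
    static List<String> extract(String text, Pattern pattern) {
        return Arrays.stream(text.split("\\.\\s*"))
                .filter(sentence -> pattern.matcher(sentence).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern pattern = keywordPattern(Arrays.asList("you", "can", "use"));
        extract("You can do it. Nothing here. Use it well.", pattern)
                .forEach(System.out::println);
    }
}
```

Compiling the Pattern once avoids recompiling the regular expression for every sentence, which is what String.matches does internally on each call.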