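How to find a whole word in a string in Java

I have a string that I need to parse for different keywords. For example, I have the string:

"I will come and meet you at the 123woods"

and my keywords are

"123woods" and "woods"

I should report every match and where it occurs. Multiple occurrences should also be accounted for. In this example, however, I should get a match only on "123woods", not on "woods". That rules out String.contains(). I should also be able to pass a list or set of keywords and check for all of them at the same time. In this example, if I have "123woods" and "come", I should get two occurrences. The method should run reasonably fast on large texts.

My idea is to use StringTokenizer, but I am not sure it will perform well. Any suggestions?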
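The StringTokenizer idea could be sketched roughly as follows. This is only an illustration, not a benchmarked solution: the class name TokenizerKeywordScan, the method findKeywords and the delimiter set are assumptions made for this example. The keywords sit in a HashSet so each token costs one lookup, and the tokenizer is asked to return delimiters as tokens so the character offset of each word can be tracked.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;

class TokenizerKeywordScan {

    // Hypothetical helper: returns "keyword@offset" for every whole-token hit.
    static List<String> findKeywords(String text, Set<String> keywords) {
        List<String> hits = new ArrayList<>();
        // Assumed delimiter set; returnDelims=true so offsets can be tracked.
        StringTokenizer tokenizer = new StringTokenizer(text, " \t\n\r.,;:!?", true);
        int offset = 0;
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            if (keywords.contains(token)) {
                hits.add(token + "@" + offset);
            }
            offset += token.length(); // delimiters come back as one-character tokens
        }
        return hits;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>();
        keywords.add("123woods");
        keywords.add("come");
        System.out.println(findKeywords("I will come and meet you at the 123woods", keywords));
    }
}
```

Because whole tokens are compared, a keyword such as "woods" is not reported for the token "123woods".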

2011-02-23 Nikola Yovchev

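Are you sure the logic is not flawed? What if you have the keywords words123 and 123words? Which keyword does a word in the text match then? – 2011-02-23 12:48:10

Neither. I only need exact word matches. – 2011-02-23 13:18:15

The example below is based on your comments. It takes a list of keywords and searches for them in a given string using word boundaries. It uses StringUtils from Apache Commons Lang to build the regular expression and prints the matched groups.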

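import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang3.StringUtils; // org.apache.commons.lang.StringUtils in Commons Lang 2.x

String text = "I will come and meet you at the woods 123woods and all the woods";

List<String> tokens = new ArrayList<String>();
tokens.add("123woods");
tokens.add("woods");

// Join the keywords into one alternation wrapped in word boundaries: \b(123woods|woods)\b
String patternString = "\\b(" + StringUtils.join(tokens, "|") + ")\\b";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);

// Print every whole-word match, in order of appearance.
while (matcher.find()) {
    System.out.println(matcher.group(1));
}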
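Since the question also asks where each match occurs, the same idea can be extended to report offsets. The following is a minimal sketch, not part of the original answer: the class name WordBoundaryScan and the method findAll are made up for illustration, the alternation is built with the standard library instead of StringUtils, and each keyword is passed through Pattern.quote on the assumption that keywords may contain regex metacharacters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryScan {

    // Hypothetical helper: returns "keyword@startIndex" for each whole-word match.
    static List<String> findAll(String text, List<String> keywords) {
        StringBuilder alternation = new StringBuilder();
        for (String keyword : keywords) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(keyword)); // treat the keyword literally
        }
        Pattern pattern = Pattern.compile("\\b(?:" + alternation + ")\\b");
        Matcher matcher = pattern.matcher(text);
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group() + "@" + matcher.start());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the woods 123woods and all the woods";
        System.out.println(findAll(text, Arrays.asList("123woods", "woods")));
    }
}
```

Matcher.start() gives the zero-based index of the match in the text, and Matcher.end() is available if the end offset is needed as well.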

If you are looking for more performance, you could have a look at StringSearch: high-performance pattern matching algorithms in Java.

2011-02-23 12:50:43 Chris

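What if I have an ArrayList and I want to build the pattern from it? It seems I have to use the trusty old StringBuilder? – 2011-02-23 13:06:12

@baba - You could do that, or you could iterate over the List<>. I am not sure which is more efficient; if performance is a concern, you may want to try both. – 2011-02-23 13:12:55

Personally, I prefer iterating over the list. I have added this option to my answer. – Chris 2011-02-23 13:30:25

You can use regular expressions. Use the Matcher and Pattern classes to get the desired output.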

2011-02-23 12:49:09 Deepak

How about something like Arrays.asList(String.split(" ")).contains("xx")?

See String.split() and "How can I test if an array contains a certain value".
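One caveat with splitting on a single space is that punctuation stays attached to the words, so "123woods." would not equal "123woods". Below is a minimal sketch of a variant that splits on runs of non-word characters and checks several keywords in one pass. The class SplitKeywordCheck and the method matchedKeywords are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class SplitKeywordCheck {

    // Hypothetical helper: returns the keywords that occur as whole words in text.
    static Set<String> matchedKeywords(String text, Set<String> keywords) {
        Set<String> found = new LinkedHashSet<>(); // keeps order of first appearance
        // "\\W+" splits on runs of characters other than [a-zA-Z_0-9].
        for (String word : text.split("\\W+")) {
            if (keywords.contains(word)) {
                found.add(word);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Set<String> keywords = new LinkedHashSet<>(Arrays.asList("123woods", "woods", "come"));
        System.out.println(matchedKeywords("I will come and meet you at the 123woods.", keywords));
    }
}
```

This reports which keywords occur, but not how often or where; the split discards the offsets.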

2011-02-23 12:50:35

You can also use regular expression matching with the \b anchor (word boundary).

2011-02-23 12:51:21

Try matching with a regular expression. Match "\b123woods\b"; \b is a word boundary.

2011-02-23 12:51:38 Axel

As others have answered, use a regular expression with word boundaries.

"I will come and meet you at the 123woods".matches(".*\\b123woods\\b.*");

will be true.

"I will come and meet you at the 123woods".matches(".*\\bwoods\\b.*");

will be false.
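One thing worth knowing about this form: String.matches() has to cover the whole input, and by default "." does not match line terminators, so the ".*" wrapper stops working once the text spans several lines. A minimal sketch of an alternative using Matcher.find(), which looks for the pattern anywhere in the input; the class FindVersusMatches and the helper containsWord are names invented for this example.

```java
import java.util.regex.Pattern;

class FindVersusMatches {

    // Hypothetical helper: whole-word test using find(), which scans for a match anywhere.
    static boolean containsWord(String text, String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text).find();
    }

    public static void main(String[] args) {
        String multiLine = "I will come and meet you\nat the 123woods";
        // matches() must cover the whole input, and '.' does not match '\n' by default.
        System.out.println(multiLine.matches(".*\\b123woods\\b.*"));
        // find() needs no ".*" wrapper and is not affected by the line break.
        System.out.println(containsWord(multiLine, "123woods"));
        System.out.println(containsWord(multiLine, "woods"));
    }
}
```

Compiling the pattern with Pattern.DOTALL would be another way to let ".*" cross line breaks.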

2011-02-23 12:56:34 morja

Hope this works for you:

import java.util.Arrays;

String string = "I will come and meet you at the 123woods";
String keyword = "123woods";

// Split on single spaces and look for an exact token match.
boolean found = Arrays.asList(string.split(" ")).contains(keyword);
if (found) {
    System.out.println("Keyword matched the string");
}

http://codigounico.blogspot.com/

2011-02-23 14:02:15 LeonardoPolitec

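A simpler way to do this is to use split():

String match = "123woods";
String text = "I will come and meet you at the 123woods";

// split() requires a delimiter; here the text is split on runs of whitespace.
String[] sentence = text.split("\\s+");
for (String word : sentence) {
    if (word.equals(match)) {
        return true;
    }
}
return false;

This is a simpler, more elegant way to do the same thing without using tokens and so on.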

2012-10-11 00:12:48 ulu5

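While it is easier to understand and write, this is not an answer to the question I asked. I have two or three, or possibly an unlimited number of, "match" keywords, and I need to get the ones that are found in the "text". Of course, you could loop over my "match" keywords for every "word" of the split text, but I find that less elegant than the already accepted solution. – 2012-10-11 07:55:18

To match "123woods" rather than "woods", use atomic grouping in the regular expression. One thing to note is that, in a string containing "123woods", it will match the first "123woods" and exit instead of searching the same string any further.

\b(?>123woods|woods)\b

It searches for 123woods as the primary alternative; once that matches, it stops trying the others.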
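For illustration, here is a small sketch of how this pattern behaves with java.util.regex. The class AtomicGroupDemo and the method findAll are hypothetical names. The atomic group only prevents the engine from going back to try the other alternative once one has matched at a given position; whether the rest of the text is scanned depends on how the Matcher is driven. With find() called in a loop, the search does continue after each match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AtomicGroupDemo {

    // Hypothetical helper: collects every match of the atomic-group pattern.
    static List<String> findAll(String text) {
        Matcher matcher = Pattern.compile("\\b(?>123woods|woods)\\b").matcher(text);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) { // each find() resumes after the previous match
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the woods 123woods and all the woods";
        System.out.println(findAll(text));
    }
}
```

A single call to find() would return only the first match, which may be the behavior the answer describes.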

2013-08-31 13:00:55 SasiRSK

I found a way to match an exact word in a string on Android:

String full = "Hello World. How are you ?"; 

String one = "Hell"; 
String two = "Hello"; 
String three = "are"; 
String four = "ar"; 


boolean is1 = isContainExactWord(full, one); 
boolean is2 = isContainExactWord(full, two); 
boolean is3 = isContainExactWord(full, three); 
boolean is4 = isContainExactWord(full, four); 

Log.i("Contains Result", is1+"-"+is2+"-"+is3+"-"+is4); 

Result: false-true-true-false

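The word-matching function:

private boolean isContainExactWord(String fullString, String partWord) {
    // Pattern.quote keeps regex metacharacters in partWord from being interpreted.
    String pattern = "\\b" + Pattern.quote(partWord) + "\\b";
    Pattern p = Pattern.compile(pattern);
    Matcher m = p.matcher(fullString);
    return m.find();
}

Done.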

2015-07-07 10:51:42

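Looking back at the original question, we need to find some given keywords in a given sentence, count the number of occurrences, and know where they are. I do not quite understand what "where" means (is it an index in the sentence?), so I will pass on that one... I am still learning Java, one step at a time, so I will get to it in due time :-)

Note that ordinary sentences (like the one in the original question) can contain repeated keywords, so the search cannot simply ask whether a given keyword exists and count it as 1 if it does. There can be more than one occurrence of the same keyword. For example: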

// Base sentence (added punctuation, to make it more interesting): 
String sentence = "Say that 123 of us will come by and meet you, " 
       + "say, at the woods of 123woods."; 

// Split it (punctuation taken in consideration, as well): 
java.util.List<String> strings = 
         java.util.Arrays.asList(sentence.split(" |,|\\.")); 

// My keywords: 
java.util.ArrayList<String> keywords = new java.util.ArrayList<>(); 
keywords.add("123woods"); 
keywords.add("come"); 
keywords.add("you"); 
keywords.add("say");

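By looking at it, the expected result would be 5 for "Say" + "come" + "you" + "say" + "123woods", counting "say" twice if we go lowercase. If we do not, the count should be 4, with "Say" excluded and "say" included. Fine. My suggestion is: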

// Set... ready...? 
int counter = 0; 

// Go! 
for(String s : strings) 
{ 
    // Ask whether each word of the sentence is among the keywords, not the
    // other way around, so repeated keywords in the sentence are counted.
    boolean found = keywords.contains(s.toLowerCase());
    if(found) 
    { 
     counter ++; 
     System.out.println("Found: " + s); 
    } 
} 

// Statistics: 
if (counter > 0) 
{ 
    System.out.println("In sentence: " + sentence + "\n" 
        + "Count: " + counter); 
}

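And the result is:

Found: Say
Found: come
Found: you
Found: say
Found: 123woods
In sentence: Say that 123 of us will come by and meet you, say, at the woods of 123woods.
Count: 5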
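If "where" is read as the character index in the sentence, one possible way to collect it is sketched below. This is an assumption about what the question wants, and the class KeywordPositions and the method locate are hypothetical names. Instead of splitting, it walks over the words with a Matcher on \w+, so the start offset of each word is available.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeywordPositions {

    // Hypothetical helper: maps each keyword (lowercase) to the start offsets where it occurs.
    static Map<String, List<Integer>> locate(String sentence, Set<String> keywords) {
        Map<String, List<Integer>> positions = new LinkedHashMap<>();
        Matcher words = Pattern.compile("\\w+").matcher(sentence);
        while (words.find()) {
            String word = words.group().toLowerCase(Locale.ROOT);
            if (keywords.contains(word)) {
                positions.computeIfAbsent(word, k -> new ArrayList<>()).add(words.start());
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String sentence = "Say that 123 of us will come by and meet you, "
                + "say, at the woods of 123woods.";
        Set<String> keywords = new HashSet<>(Arrays.asList("123woods", "come", "you", "say"));
        System.out.println(locate(sentence, keywords));
    }
}
```

The size of each list is the per-keyword count, and the sum of the sizes is the total count from the loop above.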

2015-07-13 23:54:14

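A solution seems to have been accepted long ago, but it can be improved, so in case someone has a similar problem:

This is a classic application of multi-pattern search algorithms.

Java pattern search (with Matcher.find) is not well suited to this. Searching for exactly one keyword is optimized in Java, but searching for an or-expression uses the regex non-deterministic automaton, which backtracks on mismatches. In the worst case, each character of the text is processed l times (where l is the sum of the pattern lengths).

Single-pattern search is better, but not well suited either. A separate search has to be started for each keyword pattern. In the worst case, each character of the text is processed p times, where p is the number of patterns.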
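To make the single-pattern case concrete, here is a minimal sketch that runs one String.indexOf pass over the text per keyword and then checks the neighbouring characters by hand. The class PerKeywordScan, its methods and the definition of a word character (letter, digit or underscore) are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PerKeywordScan {

    // Assumed notion of a word character: letter, digit or underscore.
    static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Hypothetical helper: one indexOf pass over the text per keyword.
    static List<String> findAll(String text, List<String> keywords) {
        List<String> hits = new ArrayList<>();
        for (String keyword : keywords) {
            if (keyword.isEmpty()) {
                continue;
            }
            int from = 0;
            int start;
            while ((start = text.indexOf(keyword, from)) >= 0) {
                int end = start + keyword.length();
                boolean boundaryBefore = start == 0 || !isWordChar(text.charAt(start - 1));
                boolean boundaryAfter = end == text.length() || !isWordChar(text.charAt(end));
                if (boundaryBefore && boundaryAfter) {
                    hits.add(keyword + "@" + start);
                }
                from = start + 1;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the woods 123woods and all the woods";
        System.out.println(findAll(text, Arrays.asList("123woods", "woods")));
    }
}
```

The outer loop is what makes the cost grow with the number of keywords: the text is scanned once for each of them.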

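Multi-pattern search processes each character of the text exactly once. Algorithms suitable for such a search are Aho-Corasick, Wu-Manber, or Set Backwards Oracle Matching. These can be found in libraries such as Stringsearchalgorithms or byteseek.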

// example with StringSearchAlgorithms 

AhoCorasick stringSearch = new AhoCorasick(asList("123woods", "woods")); 

CharProvider text = new StringCharProvider("I will come and meet you at the woods 123woods and all the woods", 0); 

StringFinder finder = stringSearch.createFinder(text); 

List<StringMatch> all = finder.findAll();
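To show the idea without depending on a particular library, here is a minimal, unoptimized Aho-Corasick sketch. It is not the StringSearchAlgorithms implementation used above: the class name AhoCorasickSketch, the method findWholeWords and the word-boundary filter are assumptions added for this illustration, and the filter assumes keywords start and end with a word character.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AhoCorasickSketch {

    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;                                      // longest proper suffix that is also a trie prefix
        final List<String> outputs = new ArrayList<>(); // keywords ending at this node
    }

    private final Node root = new Node();

    AhoCorasickSketch(List<String> keywords) {
        // 1. Build the trie of all keywords.
        for (String keyword : keywords) {
            Node node = root;
            for (char c : keyword.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.outputs.add(keyword);
        }
        // 2. Breadth-first pass to set failure links and merge outputs.
        Deque<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            for (Map.Entry<Character, Node> edge : node.next.entrySet()) {
                Node child = edge.getValue();
                Node fallback = node.fail;
                while (fallback != root && !fallback.next.containsKey(edge.getKey())) {
                    fallback = fallback.fail;
                }
                Node target = fallback.next.get(edge.getKey());
                child.fail = (target != null) ? target : root;
                child.outputs.addAll(child.fail.outputs);
                queue.add(child);
            }
        }
    }

    // Returns "keyword@start" for every whole-word occurrence, in one pass over the text.
    List<String> findWholeWords(String text) {
        List<String> hits = new ArrayList<>();
        Node node = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (node != root && !node.next.containsKey(c)) {
                node = node.fail;
            }
            node = node.next.getOrDefault(c, root);
            for (String keyword : node.outputs) {
                int start = i - keyword.length() + 1;
                if (isBoundary(text, start) && isBoundary(text, i + 1)) {
                    hits.add(keyword + "@" + start);
                }
            }
        }
        return hits;
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // True if pos lies between a word character and a non-word character (or a text edge).
    private static boolean isBoundary(String text, int pos) {
        boolean before = pos > 0 && isWordChar(text.charAt(pos - 1));
        boolean after = pos < text.length() && isWordChar(text.charAt(pos));
        return before != after;
    }

    public static void main(String[] args) {
        AhoCorasickSketch search = new AhoCorasickSketch(Arrays.asList("123woods", "woods"));
        System.out.println(search.findWholeWords(
                "I will come and meet you at the woods 123woods and all the woods"));
    }
}
```

The automaton itself reports "woods" inside "123woods" as well; the boundary check after each hit is what restricts the result to whole words.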

2016-08-13 10:22:39 CoronA
