2012-05-12 38 views
12

Java: split a string by space, newline, tab, and punctuation

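I have a string like this:

String message = "This is the new message or something like that, OK"; 

and I want to split it into an array:

String[] dic = {"this", "is", "the", "new", "message", "or", "something", "like", "that", "OK"}; 

I used

String[] dic = message.split("\\s+"); 

The problem is that the result contains "that," (with the comma) rather than "that", which is what I want. Please show me how to solve it. Thanks.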

+0

Possible duplicate of [Splitting strings through regular expressions by punctuation and whitespace etc in java](http://stackoverflow.com/questions/7384791/splitting-strings-through-regular-expressions-by-punctuation-and-whitespace-etc) – assylias

Answers

24

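You can do

String[] dic = message.split("\\W+"); 

\\W (the Java string form of the regex \W) matches any non-word character, that is, anything other than a letter, digit, or underscore.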
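To make the behaviour of the \\W+ split above concrete, here is a minimal sketch that applies it to the message from the question. The class name SplitOnNonWord and the helper splitWords are illustrative names, not part of the original answer.

```java
import java.util.Arrays;

class SplitOnNonWord {
    // Splits on runs of non-word characters, i.e. anything outside [a-zA-Z0-9_].
    // (splitWords is a hypothetical helper name used only for this sketch.)
    static String[] splitWords(String text) {
        return text.split("\\W+");
    }

    public static void main(String[] args) {
        String message = "This is the new message or something like that, OK";
        System.out.println(Arrays.toString(splitWords(message)));
        // prints [This, is, the, new, message, or, something, like, that, OK]
    }
}
```

Two caveats are worth keeping in mind. If the input starts with a delimiter (for example ", hello"), the resulting array begins with an empty string, and an apostrophe counts as a non-word character, so "don't" is split into "don" and "t".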

+0

Thanks, everyone! I chose Garrett Hall's answer. –

3

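Using Guava:

import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import com.google.common.collect.Iterables;

// define the splitter as a constant
// (newer Guava releases also offer CharMatcher.whitespace() in place of the WHITESPACE constant)
private static final Splitter SPLITTER =
    Splitter.on(CharMatcher.WHITESPACE.or(CharMatcher.is(',')))
        .trimResults()
        .omitEmptyStrings();
// ...

// and now use it in your code
String[] str = Iterables.toArray(SPLITTER.split(yourString), String.class);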
+0

Looking at the title, the goal is to remove all punctuation, not just ','. – assylias

+0

@assylias OK, then it would be 'Splitter.on(CharMatcher.JAVA_LETTER.negate()).trimResults().omitEmptyStrings()' –
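For readers without Guava on the classpath, a rough plain-JDK analogue of the "split on anything that is not a letter" idea in the comment above might look like the sketch below. It swaps the Guava Splitter for String.split with the Unicode class \P{L} (any non-letter) and filters out empty pieces by hand. The names SplitOnNonLetters and splitLetters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitOnNonLetters {
    // Splits on runs of characters that are not Unicode letters and drops
    // empty pieces, roughly mirroring omitEmptyStrings() in the Guava version.
    static String[] splitLetters(String text) {
        List<String> words = new ArrayList<>();
        for (String piece : text.split("\\P{L}+")) {
            if (!piece.isEmpty()) {
                words.add(piece);
            }
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLetters("that, OK... 42 times")));
        // prints [that, OK, times]
    }
}
```

Unlike \\W+, this treats digits and underscores as delimiters too, so it only fits when the tokens of interest are purely alphabetic.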

3

You can use StringTokenizer:

import java.util.StringTokenizer;

String message = "This is the new message or something like that, OK"; 
String delim = " \n\r\t,.;"; // list every delimiter character here
StringTokenizer st = new StringTokenizer(message, delim); 
while (st.hasMoreTokens()) { 
    System.out.println(st.nextToken()); 
} 
+0

This works, thanks a lot, man! –
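Since the question asks for an array rather than printed output, the StringTokenizer loop above could be adapted to collect the tokens, as in this sketch. The class name TokenizeToArray and the helper tokenize are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizeToArray {
    // Collects every token into a String[]; each character in `delimiters`
    // is treated as a separator, and runs of separators yield no empty tokens.
    static String[] tokenize(String text, String delimiters) {
        StringTokenizer st = new StringTokenizer(text, delimiters);
        List<String> tokens = new ArrayList<>(st.countTokens());
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String message = "This is the new message or something like that, OK";
        System.out.println(Arrays.toString(tokenize(message, " \n\r\t,.;")));
        // prints [This, is, the, new, message, or, something, like, that, OK]
    }
}
```

The trade-off compared with a regex split is that every punctuation character to be stripped has to be listed in the delimiter string explicitly.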
