2012-06-27

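Regex to capture hyphenated words between lines and non-hyphenated words

I want to write a regular expression in Java that matches both plain words and hyphenated words. So far I have: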

Pattern p1 = Pattern.compile("\\w+(?:-\\w+)",Pattern.CASE_INSENSITIVE); 
Pattern p2 = Pattern.compile("[a-zA-Z0-9]+",Pattern.CASE_INSENSITIVE); 
Pattern p3 = Pattern.compile("(?<=\\s)[\\w]+-$",Pattern.CASE_INSENSITIVE | Pattern.DOTALL); 

This is my test case:

 
    Programs 
    Dsfasdf. Programs Programs Dsfasdf. Dsfasdf. as is wow woah! woah. woah? okay. 
    he said, "hi." aasdfa. wsdfalsdjf. go-to go- 
to 
asdfasdf.. , : ; " ' () ? ! -/\ @ # $ % &^~ ` * [ ] { } + _ 123

Any help would be awesome.

My expected result would be to match all of the words, i.e.:

 
Programs Dsfasdf Programs Programs Dsfasdf Dsfasdf 
as is wow woah woah woah okay he said hi aasdfa 
wsdfalsdjf go-to go-to asdfasdf

The part I'm struggling with is matching a word that is split across two lines as a single word.

i.e.

go- 
to 

So you want to match words, with or without hyphens, but not punctuation or digits? The expected result for your test case would also be useful. –


I'm not clear on what you mean by "hyphenated words *between lines*". – Junuxx

Answers

2
 
    \p{L}+(?:-\n?\p{L}+)*
    \   /^\ /^\ /\   /^^^
     \ / | | | |  \ / |||
      |  | | | |   |  ||`- The preceding group (a literal '-', an optional newline, then one or more letters) may repeat zero or more times
      |  | | | |   |  |`-- End of the non-capturing group
      |  | | | |   |  `--- One or more of the preceding (any letter)
      |  | | | |   `------ Any Unicode letter (upper or lower case)
      |  | | | `---------- A single newline (optional because of `?`)
      |  | | `------------ Literal '-'
      |  | `-------------- Start of the non-capturing group
      |  `---------------- One or more of the preceding (any letter)
      `------------------- Any Unicode letter (upper or lower case)

Is this OK?
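
For anyone who wants to try it, here is a minimal sketch of how the pattern above could be used from Java. The class name `HyphenWords` and the helper `words` are made up for illustration, and stripping the line break after the hyphen is my own assumption about what "match as one word" should produce. Note that the backslashes have to be doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenWords {
    // The pattern from this answer, with backslashes doubled for a Java string literal.
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:-\\n?\\p{L}+)*");

    // Hypothetical helper: collects every match and removes a newline that
    // directly follows a hyphen, so a word split across lines comes back whole.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            result.add(m.group().replace("-\n", "-"));
        }
        return result;
    }

    public static void main(String[] args) {
        // A shortened version of the test case from the question.
        String text = "he said, \"hi.\" go-to go-\nto asdfasdf.. _ 123";
        System.out.println(words(text));
        // prints [he, said, hi, go-to, go-to, asdfasdf]
    }
}
```

The pattern only allows a bare `\n` after the hyphen, so input with Windows line endings (`\r\n`) or trailing spaces before the line break would need the optional part widened.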

Very cool ASCII "art"... –

Is it auto-generated? If so, where does it come from? –

Handcrafted, on autopilot! – ohaal

1

I would go with the regex:

\p{L}+(?:\-\p{L}+)* 

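Such a regex should also match words like "fiancé" and "à-la-carte", and other words containing characters from the "letter" category beyond plain ASCII. \p{L} matches a single code point in the category "letter".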
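
To illustrate the difference, here is a small sketch comparing this pattern with a `\w`-based one on accented input. The class `UnicodeLetters` and the helper `findAll` are hypothetical names used only for this example. It relies on the fact that `\w` in Java covers only `[a-zA-Z_0-9]` unless `Pattern.UNICODE_CHARACTER_CLASS` is set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeLetters {
    // Hypothetical helper: returns all matches of regex in text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Accented letters are written as Unicode escapes so the source file
        // encoding does not matter.
        String text = "fianc\u00e9 \u00e0-la-carte go-to";

        // Letter-category pattern from this answer: keeps accented words whole.
        System.out.println(findAll("\\p{L}+(?:\\-\\p{L}+)*", text));

        // ASCII-only word characters: stops at each accented letter.
        System.out.println(findAll("\\w+(?:-\\w+)*", text));
    }
}
```

With the `\w` version the matches come back as `fianc`, `la-carte` and `go-to`, because the accented letters are not word characters by default. One caveat: `\p{L}` matches the precomposed letter, but an accent stored as a separate combining mark belongs to a mark category rather than "letter", so such input would have to be normalized first.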
