2011-01-20 135 views
3

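Complex regular expression

I want to use regular expressions in my Java program to recognize certain features of my strings. I have strings of this type:

-Author- has wrote (-hh-:-mm-)

So, for example, I have a string:

Cecco has wrote (15:12)

and I have to extract the author, hh and mm fields. Obviously, I have some constraints to consider: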

hh and mm must be numbers

the author has no restrictions

I have to account for the space between "has wrote" and (

I don't know how to use regular expressions for this. Can you help me?

Edit: I'm attaching my snippet:

String mRegex = "(\\s)+ has wrote \\((\\d\\d):(\\d\\d)\\)";
Pattern mPattern = Pattern.compile(mRegex);

String[] str = {
    "Cecco CQ has wrote (14:55)",             // OK (matched)
    "yesterday you has wrote that I'm crazy", // NO (different text)
    "Simon has wrote (yesterday)",            // NO (yesterday isn't numbers)
    "John has wrote (22:32)",                 // OK
    "James has wrote(22:11)",                 // NO (missing space between "has wrote" and "(")
    "Tommy has wrote (xx:ss)"                 // NO (xx and ss aren't numbers)
};

for (String s : str) {
    Matcher mMatcher = mPattern.matcher(s);
    while (mMatcher.find()) {
        System.out.println(mMatcher.group());
    }
}
+1

Looking for "has wrote"? You'll want to drop the "has". Added benefit: "you" and "I" will then also work, in addition to actual names. – 2011-01-20 10:46:58

Answers

2

Homework?

Something like:

(.+) has wrote \((\d\d):(\d\d)\) 

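should do the trick

  • () - marks a capturing group (there are 3 above)
  • .+ - one or more of any character (you said there are no restrictions)
  • \d - any digit
  • \( \) - escaped parentheses, matched literally rather than starting a capturing group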

Usage:

Pattern p = Pattern.compile("(.+) has wrote \\((\\d\\d):(\\d\\d)\\)"); 

Matcher m = p.matcher("Gareth has wrote (12:00)"); 

if(m.matches()){ 
    System.out.println(m.group(1)); 
    System.out.println(m.group(2)); 
    System.out.println(m.group(3)); 
} 
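
As a quick sanity check, here is a minimal sketch that runs the pattern above against the six test strings from the question. The class name `HasWroteCheck` and the helper `extract` are made up for illustration; only `Pattern` and `Matcher` from `java.util.regex` are real API. Because it uses `matches()`, the whole string has to fit the pattern, so lines with different text, a missing space, or non-numeric times are rejected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HasWroteCheck {
    // Pattern from the answer above: author, a literal " has wrote ", then (hh:mm).
    private static final Pattern P =
            Pattern.compile("(.+) has wrote \\((\\d\\d):(\\d\\d)\\)");

    // Hypothetical helper: returns {author, hh, mm}, or null when the line does not match.
    public static String[] extract(String line) {
        Matcher m = P.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // The six test strings from the question.
        String[] lines = {
            "Cecco CQ has wrote (14:55)",
            "yesterday you has wrote that I'm crazy",
            "Simon has wrote (yesterday)",
            "John has wrote (22:32)",
            "James has wrote(22:11)",
            "Tommy has wrote (xx:ss)"
        };
        for (String line : lines) {
            String[] parts = extract(line);
            System.out.println(line + " -> "
                    + (parts == null ? "no match" : String.join(" | ", parts)));
        }
    }
}
```

Only the first and fourth strings should come back with an author and a time; the other four should report no match.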

To match an optional (hh:mm) at the end, you need to start using some dark regex voodoo:

Pattern p = Pattern.compile("(.+) has wrote\\s?(?:\\((\\d\\d):(\\d\\d)\\))?"); 

Matcher m = p.matcher("Gareth has wrote (12:00)"); 

if(m.matches()){ 
    System.out.println(m.group(1)); 
    System.out.println(m.group(2)); 
    System.out.println(m.group(3)); 
} 

m = p.matcher("Gareth has wrote"); 
if(m.matches()){  
    System.out.println(m.group(1)); 
    // m.group(2) == null since it didn't match anything 
} 

The new pattern, without the Java string escaping:

(.+) has wrote\s?(?:\((\d\d):(\d\d)\))? 
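  • \s? optionally matches one whitespace character (there may be no trailing space when the (hh:mm) group is absent)
  • (?: ...) is a non-capturing group, which lets you put ? after it to make the whole group optional

I think @codinghorror has something to say about regex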

+0

Thank you very much. But what if I have this case: "Gareth has wrote" with no () part, just the end of the line? How should I modify my pattern string? – CeccoCQ 2011-01-20 11:03:34

0

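Well, in case you didn't know, Matcher has a nice feature for pulling out specific groups, that is, the parts of the pattern enclosed in (): Matcher.group(int). For example, if I wanted to match the digits between two colons, as in: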

:22:

I could use the regex ":(\\d+):" to match one or more digits between two colons, and then get just the digits with:

Matcher.group(1)

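Then it is just a matter of parsing the String into an int. Note that capturing-group numbers start at 1. group(0) is the whole match, so Matcher.group(0) in the previous example would return :22: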
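
A minimal sketch of that idea follows. The class name `ColonNumber` and the helper `numberBetweenColons` are invented for this example, and returning -1 when nothing is found is just one possible convention. It uses `find()` rather than `matches()` so the `:22:` part may sit anywhere in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColonNumber {
    // One or more digits enclosed by two colons; the digits are capturing group 1.
    private static final Pattern BETWEEN_COLONS = Pattern.compile(":(\\d+):");

    // Hypothetical helper: returns the first number found between two colons,
    // or -1 if there is none. Integer.parseInt throws NumberFormatException
    // if the digit run is too long to fit in an int.
    public static int numberBetweenColons(String text) {
        Matcher m = BETWEEN_COLONS.matcher(text);
        if (!m.find()) {
            return -1;
        }
        return Integer.parseInt(m.group(1));
    }

    public static void main(String[] args) {
        Matcher m = BETWEEN_COLONS.matcher("abc :22: def");
        if (m.find()) {
            System.out.println(m.group(0)); // whole match, including the colons
            System.out.println(m.group(1)); // only the digits
        }
        System.out.println(numberBetweenColons(":22:"));
    }
}
```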

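For your case, I think these are the regex pieces you need to consider:

  • "[A-Za-z]" for alphabetic characters (you can probably also safely use "\\w", which matches alphabetic characters as well as digits and _).
  • "\\d" for digits (1, 2, 3 ...)
  • "+" to indicate that you want one or more of the preceding character or group.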
1

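The easiest way to work out a regex is to try it in a testing tool before you start coding.
I use http://www.brosinski.com/regex/

Using this Eclipse plugin, I came up with the following result: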

([a-zA-Z]*) has wrote \((\d\d):(\d\d)\) 
Cecco has wrote (15:12) 

Found 1 match(es): 

start=0, end=23 
Group(0) = Cecco has wrote (15:12) 
Group(1) = Cecco 
Group(2) = 15 
Group(3) = 12 
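
If no such tool is at hand, roughly the same report can be produced in plain Java. This is only a sketch: `GroupDump` and `dump` are names made up here, and the output format simply imitates the start/end and group lines shown above (it does not print the match count).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDump {
    // Hypothetical helper: lists every match of regex in input, with its
    // start/end offsets and all of its groups (group 0 is the whole match).
    public static String dump(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append("start=").append(m.start())
              .append(", end=").append(m.end()).append('\n');
            for (int i = 0; i <= m.groupCount(); i++) {
                sb.append("Group(").append(i).append(") = ")
                  .append(m.group(i)).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String regex = "([a-zA-Z]*) has wrote \\((\\d\\d):(\\d\\d)\\)";
        System.out.print(dump(regex, "Cecco has wrote (15:12)"));
    }
}
```

One thing such a dump makes visible: because the author group only allows letters and `find()` may start anywhere in the input, a name containing a space such as "Cecco CQ" yields only "CQ" as group 1.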

An excellent tutorial on regex syntax can be found at http://www.regular-expressions.info/tutorial.html