2016-10-12

Regex: match only a specific domain

I have the following string:

<a href="/web/20120412083942/http://test.com/contact">Contact Us</a> | <a href="/web/20120412083942/https://test.com/privacy-policy">Privacy Policy</a> <br /><br /> 
<a href="/web/20120412083942/http://www.cassandracastanedaphoto.com/index2.php#/home/">Photography by Cassandra Castenada</a></span><!-- Start Shareaholic TopSharingBar Automatic --><!-- End Shareaholic TopSharingBar Automatic --><script src="/web/20120412083942js_/http://www.test.com/wp-content/plugins/tweetmeme/button.js" type="text/javascript"></script> 
<!-- tracker added by Ultimate Google Analytics plugin v1.6.0: /web/20120412083942/http://www.oratransplant.nl/uga --> 

I want to match:

/web/20120412083942/http://test.com

/web/20120412083942/https://test.com

/web/20120412083942js_/http://www.test.com

Basically, any URL of the form /web/[digits][optional string]/http://test.com

Here is my regex so far:

((http(s)?:\/\/)?web.archive.org)?\/web\/\d+.*?\/http(s)?:\/\/(www\.)?test\.com 

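The problem is that it matches this whole section:

/web/20120412083942/http://www.cassandracastanedaphoto.com/index2.php#/home/">Photography by Cassandra Castenada ... http://test.com

How can I make it stop when the domain right after the /web/ prefix does not start with test.com?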
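The over-match can be reproduced with a short sketch. Python is an assumption here (the question does not name a language), and the `html` string is an abbreviated copy of the markup above. The lazy `.*?` only minimizes the length of a match from a given starting point; it can still cross `/` characters, so an attempt that starts at the `/web/` of the unrelated link keeps expanding until it reaches the next `test.com` URL.

```python
import re

# Abbreviated copy of the markup from the question (assumption: kept on one line,
# as in the original, because "." does not match newlines by default).
html = (
    '<a href="/web/20120412083942/http://test.com/contact">Contact Us</a> | '
    '<a href="/web/20120412083942/http://www.cassandracastanedaphoto.com/index2.php#/home/">'
    'Photography by Cassandra Castenada</a>'
    '<script src="/web/20120412083942js_/http://www.test.com/wp-content/plugins/tweetmeme/button.js">'
    '</script>'
)

# The pattern from the question, unchanged.
original = re.compile(
    r'((http(s)?:\/\/)?web.archive.org)?\/web\/\d+.*?\/http(s)?:\/\/(www\.)?test\.com'
)

# Collect the full text of every match.
matches = [m.group(0) for m in original.finditer(html)]

for text in matches:
    print(text)
```

The first match is the intended one; the second begins at the cassandracastanedaphoto.com link and runs through to `http://www.test.com` in the script tag.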

Answers

1

I had success with this regex pattern:

Pattern: /web/[^/]+/http[s]{0,1}://(|www\.)test\.com/?[._a-zA-Z-0-9]+ 

Options: ^ and $ match at line breaks

Match the characters “/web/” literally «/web/» 
Match any character that is NOT a “/” «[^/]+» 
    Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» 
Match the characters “/http” literally «/http» 
Match the character “s” «[s]{0,1}» 
    Between zero and one times, as many times as possible, giving back as needed (greedy) «{0,1}» 
Match the characters “://” literally «://» 
Match the regular expression below and capture its match into backreference number 1 «(|www\.)» 
    Match either the regular expression below (attempting the next alternative only if this one fails) «» 
     Empty alternative effectively makes the group optional (following alternatives will be tried if the regex backtracks into the group) «|» 
    Or match regular expression number 2 below (the entire group fails if this one fails to match) «www\.» 
     Match the characters “www” literally «www» 
     Match the character “.” literally «\.» 
Match the characters “test” literally «test» 
Match the character “.” literally «\.» 
Match the characters “com” literally «com» 
Match the character “/” literally «/?» 
    Between zero and one times, as many times as possible, giving back as needed (greedy) «?» 
Match a single character present in the list below «[._a-zA-Z-0-9]+» 
    Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» 
    One of the characters “._” «._» 
    A character in the range between “a” and “z” «a-z» 
    A character in the range between “A” and “Z” «A-Z» 
    The character “-” «-» 
    A character in the range between “0” and “9” «0-9» 
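As a rough check, the pattern can be run against the sample markup. This is a minimal sketch assuming Python's `re` module (the thread does not name a language); `html` is an abbreviated copy of the question's string, and `prefix_only` is a hypothetical variant that is not part of the answer above.

```python
import re

# Abbreviated copy of the markup from the question.
html = (
    '<a href="/web/20120412083942/http://test.com/contact">Contact Us</a> | '
    '<a href="/web/20120412083942/https://test.com/privacy-policy">Privacy Policy</a> '
    '<a href="/web/20120412083942/http://www.cassandracastanedaphoto.com/index2.php#/home/">'
    'Photography by Cassandra Castenada</a>'
    '<script src="/web/20120412083942js_/http://www.test.com/wp-content/plugins/tweetmeme/button.js">'
    '</script>'
)

# The answer's pattern, as given. [^/]+ cannot cross a "/", so a match can no
# longer start at one archived URL and end inside another.
answer = re.compile(r'/web/[^/]+/http[s]{0,1}://(|www\.)test\.com/?[._a-zA-Z-0-9]+')
answer_matches = [m.group(0) for m in answer.finditer(html)]

# Hypothetical variant (an assumption, not the answerer's pattern): stop right
# after the domain, which is the form the question lists as the desired match.
prefix_only = re.compile(r'/web/[^/]+/https?://(?:www\.)?test\.com')
prefix_matches = [m.group(0) for m in prefix_only.finditer(html)]

print(answer_matches)
print(prefix_matches)
```

Two things to note about the answer's pattern: it also consumes the first path segment after the domain (for example `/contact`), and because the final character class uses `+`, it needs at least one such character, so a link that ends right after `test.com` would not match. The variant drops that trailing part.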
+0

In case anyone is wondering, the breakdown above was exported from RegexBuddy :)

+0

Wow, that is a great tool to test with! Works perfectly! Thank you :)