Python regex - exclude URLs containing a word

I have a regex question about URLs. I have 4 example URLs:

http://auto.com/index.php/car-news/12158-classicauto-cup-2016-photo 
http://auto.com/index.php/car-news/11654-battle-royale-2014 
http://auto.com/index.php/tv-special-news/10480-new-film-4 
http://auto.com/index.php/first/12234-new-volvo-xc60

I want to exclude URLs that contain "tv-special-news" or that end with "photo".

I have tried:

http://(www.)?auto.com/index.php/(?!(tv-special-news)).*/[a-zA-Z0-9\-]{1,}-(?!photo)

but it doesn't quite work the way I want.

Source

2017-08-01 Alek SZ

I think you can do this without a regex, using 'tv-special-news' in url and .endswith –

Unfortunately I need a regex :) –
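For comparison, the non-regex approach suggested in the comment above could be sketched roughly as follows. The helper name `keep` and the variable names are made up for illustration, and the sample URLs are the ones from the question with trailing whitespace removed.

```python
# The four sample URLs from the question (trailing whitespace removed).
urls = [
    "http://auto.com/index.php/car-news/12158-classicauto-cup-2016-photo",
    "http://auto.com/index.php/car-news/11654-battle-royale-2014",
    "http://auto.com/index.php/tv-special-news/10480-new-film-4",
    "http://auto.com/index.php/first/12234-new-volvo-xc60",
]


def keep(url):
    """Hypothetical helper: True unless the URL contains
    'tv-special-news' or ends with 'photo'."""
    return "tv-special-news" not in url and not url.endswith("photo")


kept = [u for u in urls if keep(u)]
print(kept)
```

This only helps when a regex is not a hard requirement, which the asker says it is.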

http://(www.)?auto.com/index.php/(?!(tv-special-news)).*/[a-zA-Z0-9\-]{1,}-(?!photo)

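You are very close. You just need to remove the dash before (?!photo) so that the line can end without a trailing dash, and add $ at the end to make sure the whole line has to match.

Then you will also have to change the negative lookahead into a negative lookbehind, so that the end of the line is not matched when it is preceded by photo: (?<!photo).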

http://(www.)?auto.com/index.php/(?!(tv-special-news)).*/[a-zA-Z0-9\-]{1,}(?<!photo)$

Also, you should properly escape all the dots:

http://(www\.)?auto\.com/index\.php/(?!(tv-special-news)).*/[a-zA-Z0-9\-]+(?<!photo)$

Also, the quantifier {1,} is equivalent to +.
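A quick way to check the final pattern against the question's sample lines might look like the sketch below. The names `raw_lines` and `matched` are illustrative only. The `strip()` call is an assumption about how the input is read: the sample lines as posted carry trailing spaces, and a trailing space would keep `$` from matching.

```python
import re

# Final pattern from the answer above, copied unchanged.
pattern = re.compile(
    r"http://(www\.)?auto\.com/index\.php/(?!(tv-special-news)).*/[a-zA-Z0-9\-]+(?<!photo)$"
)

# Sample lines as posted in the question; the first three end with a space.
raw_lines = [
    "http://auto.com/index.php/car-news/12158-classicauto-cup-2016-photo ",
    "http://auto.com/index.php/car-news/11654-battle-royale-2014 ",
    "http://auto.com/index.php/tv-special-news/10480-new-film-4 ",
    "http://auto.com/index.php/first/12234-new-volvo-xc60",
]

# Strip whitespace first, then keep only the lines the pattern accepts.
matched = [line.strip() for line in raw_lines if pattern.search(line.strip())]
print(matched)
```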

Source

2017-08-01 15:59:58 poke

Thank you very much! –

This is not a correct regex. [You can see this demo](https://regex101.com/r/OEhDvU/3) – anubhava

@anubhava My bad, OP's input had a trailing space, which made me miss this. Fixed it now, thanks! – poke

You can use this regex:

^(?!.*-photo$)http://(?:www\.)?auto\.com/index\.php/(?!tv-special-news)[^/]+/[\w-]+-

RegEx Demo 1

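(?!.*-photo$) is a negative lookahead that fails the match if the URL ends with photo.
(?!tv-special-news) is a negative lookahead that fails the match when tv-special-news appears right after /index.php/.
It is better to use a start anchor in your regex.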
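The two lookaheads described above can be tried out with a short sketch like this one. The pattern is copied as it appears in this answer. Since it has no end anchor, the sketch uses `re.match`, which only requires the pattern to match from the start of the string. The variable names are illustrative.

```python
import re

# Lookahead-based pattern, copied as shown in the answer.
lookahead_pattern = re.compile(
    r"^(?!.*-photo$)http://(?:www\.)?auto\.com/index\.php/(?!tv-special-news)[^/]+/[\w-]+-"
)

# Sample URLs from the question (trailing whitespace removed).
urls = [
    "http://auto.com/index.php/car-news/12158-classicauto-cup-2016-photo",
    "http://auto.com/index.php/car-news/11654-battle-royale-2014",
    "http://auto.com/index.php/tv-special-news/10480-new-film-4",
    "http://auto.com/index.php/first/12234-new-volvo-xc60",
]

# Keep the URLs that pass both negative lookaheads.
kept = [u for u in urls if lookahead_pattern.match(u)]
print(kept)
```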

Or, with a lookbehind regex, you can use:

^http://(www\.)?auto\.com/index\.php/(?!tv-special-news).*/[a-zA-Z0-9-]+$(?<!photo)

RegEx Demo 2
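As a rough check of the lookbehind variant, the sketch below records for each sample URL whether the pattern accepts it. The names `lookbehind_pattern` and `results` are made up for this example.

```python
import re

# Lookbehind-based pattern, copied as shown in the answer.
lookbehind_pattern = re.compile(
    r"^http://(www\.)?auto\.com/index\.php/(?!tv-special-news).*/[a-zA-Z0-9-]+$(?<!photo)"
)

# Sample URLs from the question (trailing whitespace removed).
urls = [
    "http://auto.com/index.php/car-news/12158-classicauto-cup-2016-photo",
    "http://auto.com/index.php/car-news/11654-battle-royale-2014",
    "http://auto.com/index.php/tv-special-news/10480-new-film-4",
    "http://auto.com/index.php/first/12234-new-volvo-xc60",
]

# Map each URL to True (accepted) or False (excluded).
results = {u: bool(lookbehind_pattern.search(u)) for u in urls}
for url, accepted in results.items():
    print(accepted, url)
```

The lookbehind sits after `$` here, and both are zero-width assertions checked at the same position.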

Source

2017-08-01 15:54:24 anubhava

You can use this solution:

import re 

list_of_urls = ["http://auto.com/index.php/car-news/12158-classicauto-cup-2016-photo",....] 


new_list = [i for i in list_of_urls if len(re.findall("photo+", i.split()[-1])) == 0 and len(re.findall("tv-special-news+", i.split()[-1])) == 0]

Source

2017-08-01 15:58:12 Ajax1234

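Thanks, but I need a regex –

You can simply store your links in a list and iterate over it with a regex:

re_pattern = r'\b(?:tv-special-news|photo)\b'

re.findall(re_pattern, link)

(where link is an item from the list)

If the pattern matches, the results are stored in a list. You then have to check whether that list is empty. If it is empty, you can include the link; otherwise exclude it.

Here is sample code: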

import re 

links = ['http://auto.com/index.php/car-news/12158-classicauto-cup-2016-photo', 'http://auto.com/index.php/car-news/11654-battle-royale-2014', 'http://auto.com/index.php/tv-special-news/10480-new-film-4', 'http://auto.com/index.php/first/12234-new-volvo-xc60'] 

new_list = [] 

re_pattern = r'\b(?:tv-special-news|photo)\b'

# Keep a link only when neither word is found in it.
for link in links:
    result = re.findall(re_pattern, link)
    if len(result) < 1:
        new_list.append(link)

print(new_list)

Source

2017-08-01 16:18:46
