Python - Detecting empty URLs using string operations

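I need to parse a file and detect empty URLs in these scenarios:

href = ''   (ideally)
href = ' '

Both cases, including the second, behave the same way. What I did was read all the text in the file into a string variable, 'searchstring'. I used searchstring.find("href = ''") != -1 for the first case above, but for the second case, where the number of spaces may vary, I'm not sure what I need to do to make sure I catch those as well. Initially I thought about using find to get the index and then iterating from there, but that seems like a laborious solution to me. This may seem silly, but I'm new to Python and only started learning it yesterday. Can anyone share some insight?

Thanks a lot in advance, Philip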

Source

2013-07-19 Kruizer

Check the length (or better yet, the bool) of href.strip():

In [47]: href = '' 

In [48]: len(href.strip()) 
Out[48]: 0 

In [49]: bool(href.strip()) 
Out[49]: False 

In [50]: href = ' ' 

In [51]: len(href.strip()) 
Out[51]: 0 

In [52]: bool(href.strip()) 
Out[52]: False
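The check above could be wrapped in a small helper so it can be reused for every value pulled out of the file. This is only a sketch: the name is_blank is hypothetical, and treating None as blank is an assumption, not something the original session shows.

```python
def is_blank(href):
    """Return True if href is None, empty, or only whitespace."""
    # str.strip() with no arguments removes leading and trailing whitespace
    # (spaces, tabs, newlines), so a whitespace-only string becomes ''.
    # Treating None as blank is an assumption made for this sketch.
    return href is None or not href.strip()


for candidate in ['', ' ', '\t\n', 'http://www.google.com']:
    print(repr(candidate), is_blank(candidate))
```

Relying on the truthiness of the stripped string avoids the explicit len() call and handles any amount of whitespace the same way.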

Source

2013-07-19 03:54:07 inspectorG4dget

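Sounds good. There is another string pattern I need to watch for, namely window.open ..... e.g. window.open('http://www.google.com'); – Kruizer

Here I need to detect whether window.open is empty. Is there a built-in function for that? – Kruizer

I don't know what 'window.open' returns, so I don't know how to check whether it is empty – inspectorG4dget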
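If the goal is to spot window.open calls whose URL argument is blank in the file's text (rather than to evaluate the JavaScript), one possible approach is a regular expression. This is a sketch under that assumption; the names EMPTY_WINDOW_OPEN and find_empty_window_open are hypothetical, and a call with no argument at all, window.open(), is not covered.

```python
import re

# Matches window.open( followed by an empty or whitespace-only quoted string.
# \1 requires the closing quote to be the same character as the opening one.
EMPTY_WINDOW_OPEN = re.compile(r"""window\.open\(\s*(['"])\s*\1""")


def find_empty_window_open(text):
    """Return the matched snippets where window.open gets a blank URL."""
    return [m.group(0) for m in EMPTY_WINDOW_OPEN.finditer(text)]


sample = "window.open('http://www.google.com'); window.open(''); window.open(\" \");"
print(find_empty_window_open(sample))
```

The call with a real URL is skipped because a non-whitespace character appears before the closing quote.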

Why don't you strip the href?

href = href.strip()

Or

if href.strip():
    print("not empty")
else:
    print("empty")

Source

2013-07-19 03:56:24 misguided

You can use re. You would do well to read the documentation.

>>> import re 
>>> s='href=""adjfweofhref=" "' 
>>> pattern = re.compile(r'href=[\"\']\s*[\"\']') 
>>> pattern.findall(s) 
['href=""', 'href=" "'] 
>>>
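The pattern above expects no spaces around the equals sign, while the question writes href = ''. A variant that tolerates that spacing and also insists that the opening and closing quotes match might look like the following sketch; the name EMPTY_HREF and the sample markup are made up for illustration.

```python
import re

# \s* around '=' tolerates "href = ''"; (["\']) captures the opening quote
# and \1 requires the same quote character to close the value.
EMPTY_HREF = re.compile(r'href\s*=\s*(["\'])\s*\1')

searchstring = """<a href = ''>a</a> <a href=" ">b</a> <a href="http://www.google.com">c</a>"""

# finditer is used instead of findall because, with a capturing group in
# the pattern, findall would return only the captured quote characters.
matches = [m.group(0) for m in EMPTY_HREF.finditer(searchstring)]
print(matches)
```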

Source

2013-07-19 03:57:12 zhangyangyu

I would start by installing BeautifulSoup... then just loop over the file and let it do the parsing for you.

From there, you can do this:

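## import re ## Don't actually need a regex here:

## Assumes soup was already built from the file, e.g. soup = BeautifulSoup(html, 'html.parser')
for link in soup.find_all('a'):
    # link.get('href') returns None when the attribute is missing,
    # so fall back to '' before calling .strip()
    if not (link.get('href') or '').strip():
        print(link, "... is empty or spacey")
    ## elif re.search(r'^\s*$', link.get('href')):
    ##     print(link, "... is spacey")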
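If installing a third-party package is not an option, the same idea can be sketched with the standard library's html.parser module. This is a swapped-in technique rather than what the answer uses; the class name BlankHrefFinder and the sample markup are hypothetical, and counting an anchor with no href at all as blank is an assumption.

```python
from html.parser import HTMLParser


class BlankHrefFinder(HTMLParser):
    """Collect positions of <a> tags whose href is missing, empty, or blank."""

    def __init__(self):
        super().__init__()
        self.blank_links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        # attrs is a list of (name, value) pairs; value is None for a bare
        # attribute, and .get() also returns None when href is absent.
        href = dict(attrs).get('href')
        if href is None or not href.strip():
            # getpos() reports the (line, column) of the tag being handled
            self.blank_links.append(self.getpos())


html = '<a href="">one</a>\n<a href=" ">two</a>\n<a href="http://www.google.com">three</a>'
finder = BlankHrefFinder()
finder.feed(html)
print(finder.blank_links)
```

Recording the position makes it easier to go back to the file and fix the offending links.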

Source

2013-07-19 03:59:07

Actually, inspectorG4dget's post reminded me that .strip() could be used to simplify this, as shown above (in my next edit) –
