Can BeautifulSoup work in a case-insensitive manner?

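I am trying to extract the meta description from fetched web pages, but I am running into BeautifulSoup's case sensitivity. Can BeautifulSoup match in a case-insensitive way?

Some pages have <meta name="Description and others have <meta name="description.

The only catch is that I cannot use lxml. I have to stick with BeautifulSoup.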

2010-04-08 Nitin

It worked with a small change:

soup.findAll('meta', attrs={'name':re.compile("^description$", re.I)})
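For readers on a current install, here is a minimal, self-contained sketch of the same regex approach. It assumes the bs4 package (where findAll is spelled find_all) and the standard-library html.parser backend, so lxml is not needed. The sample pages and the meta_descriptions helper are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample pages, one for each capitalisation of the attribute value.
pages = [
    '<html><head><meta name="Description" content="First page"></head></html>',
    '<html><head><meta name="description" content="Second page"></head></html>',
]

# Compile once; re.I makes the match ignore case.
name_is_description = re.compile(r"^description$", re.I)

def meta_descriptions(html):
    """Return the content of every meta description tag, whatever its case."""
    soup = BeautifulSoup(html, "html.parser")  # stdlib parser, no lxml required
    tags = soup.find_all("meta", attrs={"name": name_is_description})
    return [tag.get("content") for tag in tags]

results = [meta_descriptions(page) for page in pages]
print(results)
```

Passing the pattern through attrs keeps it away from the first positional parameter, which is reserved for the tag name.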

2010-04-09 07:03:55 Nitin

You can give BeautifulSoup a regular expression to match attributes against. Something like

soup.findAll('meta', name=re.compile("^description$", re.I))

might do the trick. Cribbed from the BeautifulSoup docs.

2010-04-08 18:26:22

Regular expressions? Now we have another problem.

Instead, you can pass in a lambda:

soup.findAll(lambda tag: tag.name.lower()=='meta', 
    name=lambda x: x and x.lower()=='description')

(The `x and` avoids an exception when the tag has no name attribute defined.)
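To see why the guard matters, here is a small sketch in plain Python, with no BeautifulSoup involved. It assumes, as the Beautiful Soup documentation describes for attribute filter functions, that the filter receives None when a tag lacks the attribute. The function names are hypothetical.

```python
def unguarded(value):
    # Raises AttributeError when value is None, because None has no .lower().
    return value.lower() == "description"

def guarded(value):
    # `value and ...` short-circuits on None (or an empty string),
    # so .lower() is only called on a non-empty string.
    return value and value.lower() == "description"

# Simulate the filter being called for a tag without a name attribute.
try:
    unguarded(None)
    unguarded_failed = False
except AttributeError:
    unguarded_failed = True

print(unguarded_failed)
print(bool(guarded(None)))
print(guarded("Description"))
```

The guarded version returns a falsy value for a missing attribute, so the tag is simply skipped instead of aborting the search.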

2013-03-07 17:49:57 MikeyB

+1 for avoiding regex. +1 for the xkcd link. – FlipMcF 2013-05-04 00:45:28

With bs4 I get "find_all() got multiple values for keyword argument 'name'" with this :/ – Joaolvcm 2014-02-20 11:14:08

With BS4, use the following:

soup.find('meta', attrs={'name': lambda x: x and x.lower()=='description'})
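A runnable sketch of this bs4 variant follows, assuming the html.parser backend and a made-up page that also contains a meta tag with no name attribute. The is_description helper is hypothetical and plays the role of the lambda above.

```python
from bs4 import BeautifulSoup

# Hypothetical page: one meta tag has no name attribute at all.
html = """
<html><head>
<meta charset="utf-8">
<meta name="keywords" content="a, b">
<meta name="DESCRIPTION" content="An example page">
</head></html>
"""

def is_description(value):
    # value is None for tags that lack the attribute, so check that first.
    return value is not None and value.lower() == "description"

soup = BeautifulSoup(html, "html.parser")

# The filter goes in the attrs dict, so it cannot collide with find()'s
# first parameter, which is itself called name.
tag = soup.find("meta", attrs={"name": is_description})
description = tag["content"] if tag is not None else None
print(description)
```

Checking for None before indexing the result keeps pages without a description from raising a TypeError.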

2015-05-31 12:47:12 Emmanuel
