Python XML parser with BeautifulSoup. How do I remove tags?

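For a project, I decided to build an application that helps people find friends on Twitter.

I have already been able to get the usernames from the XML page. For example, with my current code I can get <uri>http://twitter.com/username</uri> from the XML page, but I want to use Beautiful Soup to strip the <uri> and </uri> tags.

Here is my current code: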

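import urllib
from BeautifulSoup import BeautifulStoneSoup

# Fetch the Atom search feed as a string
doc = urllib.urlopen("http://search.twitter.com/search.atom?q=travel").read()

# Parse the XML and collect every <uri> element (tags included)
soup = BeautifulStoneSoup(doc)
data = soup.findAll("uri")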


2011-07-17 Geroge

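Did the answers below help? – Johnsyweb

To answer your question about BeautifulSoup: .text is what you need to grab the contents of each <uri> tag. Here I pull that information out with a list comprehension: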

>>> uris = [uri.text for uri in soup.findAll('uri')] 
>>> len(uris) 
15 
>>> print uris[0] 
http://twitter.com/MarieJeppesen

However, as zeekay says, Twitter's REST API is a better way to query Twitter.


2011-07-17 00:35:37 Johnsyweb

Don't parse Twitter with BeautifulSoup; use their API (and don't use BeautifulSoup anyway, use lxml). To answer your question:

import urllib
from BeautifulSoup import BeautifulSoup

resp = urllib.urlopen("http://search.twitter.com/search.atom?q=travel")
soup = BeautifulSoup(resp.read())
# extract() detaches each <uri> element, contents included, from the tree
for uri in soup.findAll('uri'):
    uri.extract()
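
As a rough illustration of the lxml suggestion, here is a minimal Python 3 sketch that uses the standard library's xml.etree.ElementTree, whose API lxml.etree largely mirrors. The sample feed, the usernames in it, and the extract_uris helper are hypothetical and only stand in for the real search feed; the sketch also assumes the <uri> elements live in the standard Atom namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the Atom search feed
# (the real feed is not fetched here).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <author>
      <name>alice (Alice)</name>
      <uri>http://twitter.com/alice</uri>
    </author>
  </entry>
  <entry>
    <author>
      <name>bob (Bob)</name>
      <uri>http://twitter.com/bob</uri>
    </author>
  </entry>
</feed>
"""

# Atom elements are namespaced, so the tag name must carry the namespace.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def extract_uris(xml_text):
    """Return the text inside every Atom <uri> element, without the tags."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(ATOM_NS + "uri") if el.text]


uris = extract_uris(SAMPLE_FEED)
print(uris)
```

Reading each element's text leaves the parsed tree untouched and returns only the URL strings, which is what the question asks for.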


2011-07-17 00:35:52 zeekay

The code you gave me still leaves the uri tags around twitter.com/username – Geroge

It shouldn't; all the tags should be gone from 'soup', i.e. str(soup).find('uri') == -1. – zeekay
