Python: append a string to a matched list with multiple items

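The code I'm working on retrieves a list from an HTML page. Each entry has 2 fields: URL and title.

The URLs all start with /URL...., and I need to add "http://website.com" to the front of each URL returned by re.findall.

The code so far looks like this: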

soup = bs(html)
tag = soup.find('div', {'class': 'item'})
reg = re.compile('<a href="(.+?)" rel=".+?" title="(.+?)"')
links = re.findall(reg, str(tag))
# (prepend "http://website.com" to each captured href "(.+?)" field here)
return links

Source

2015-12-25 Aenema

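http://stackoverflow.com/a/1732454/1459669 Please use Beautiful Soup to find the links! –

@CrazyPython Unless you want to summon Cthulhu. – timgeb

@timgeb You never know, he might want to summon him. Then we would need to migrate this to StackExchange Skeptics or Worldbuilding ... –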

Try:

for link in tag.find_all('a'): 
    link['href'] = 'http://website.com' + link['href']

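Then use one of these output methods:

return str(soup) gives you the whole document with the changes applied.

return tag.find_all('a') gives you all the link elements.

return [str(i) for i in tag.find_all('a')] gives you all the link elements converted to strings.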
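The loop and the three output options can be put together in one small script. This is only a sketch: the sample markup, the "html.parser" backend, and the variable names whole_document, link_elements, and link_strings are assumptions for illustration, since the real page is not shown in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page is not shown in the question.
html = """
<div class="item">
  <a href="/first" rel="bookmark" title="First">one</a>
  <a href="/second" rel="bookmark" title="Second">two</a>
</div>
"""

# "html.parser" is the stdlib-backed parser; the original code does not say which one it uses.
soup = BeautifulSoup(html, "html.parser")
tag = soup.find("div", {"class": "item"})

# Rewrite each href in place by putting the site prefix in front of it.
for link in tag.find_all("a"):
    link["href"] = "http://website.com" + link["href"]

whole_document = str(soup)                      # option 1: the modified document
link_elements = tag.find_all("a")               # option 2: the link elements
link_strings = [str(i) for i in link_elements]  # option 3: the links as strings
```

Because the href values are changed on the parsed tree itself, all three outputs reflect the new URLs.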

And don't try to parse HTML with regular expressions when you already have an XML parser at work.
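If what you want is the list of (URL, title) pairs that the regex was meant to produce, the parser can build it directly. A minimal sketch, assuming every link of interest carries both href and title attributes; the function name extract_links, the BASE_URL constant, and the sample markup are hypothetical.

```python
from bs4 import BeautifulSoup

BASE_URL = "http://website.com"  # prefix taken from the question


def extract_links(html):
    """Return (absolute_url, title) tuples for the links inside div.item."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("div", {"class": "item"})
    if tag is None:
        # No matching container on the page, so there is nothing to return.
        return []
    # href=True and title=True skip anchors that lack either attribute.
    return [
        (BASE_URL + a["href"], a["title"])
        for a in tag.find_all("a", href=True, title=True)
    ]


# Hypothetical sample markup standing in for the real page.
sample = (
    '<div class="item">'
    '<a href="/first" rel="bookmark" title="First">one</a>'
    '<a href="/second" rel="bookmark" title="Second">two</a>'
    "</div>"
)
links = extract_links(sample)
```

The result has the same shape as the re.findall output, a list of 2-tuples, so code that consumes links should not need to change.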

Source

2015-12-26 00:11:00

Oops, my bad. I had the URL concatenation in reversed order. –
