Parsing a JavaScript href with Python

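I've been having a lot of trouble with this. I'm new to Python, so apologies if I just don't know the right search terms to find the answer myself. I'm not even positive the problem is caused by the JavaScript, but that's the best idea I have.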

Here is the part of the HTML I'm parsing:

... 
<div class="promotion"> 
    <div class="address"> 
     <a href="javascript:PropDetail2('57795471:MRMLS')" title="View property detail for 5203 Alhama Drive">5203 Alhama Drive</a> 
    </div> 
</div> 
...

...and here is the Python I'm using (this version is the closest I've come to success):

from bs4 import BeautifulSoup

homeFinderSoup = BeautifulSoup(open("homeFinderHTML.html"), "html5lib")
addressClass = homeFinderSoup.find_all('div', 'address')
for row in addressClass:
    print(row.get('href'))

...which returns:

None 
None 
None

Source

2012-05-29 Z J Rollyson

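Without digging into the docs or anything, it looks like your code loops over all the divs with class "address" and looks for an href attribute on them, which they don't have. You need to get the anchor tags inside those divs, then look up THOSE tags' href attributes to get what you're after.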
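To illustrate the point in the comment above, here is a minimal sketch that compares the two lookups on the markup from the question. The `html` string and the choice of `"html.parser"` are assumptions made so the snippet runs on its own; the question reads from a file and uses html5lib instead.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question. "html.parser" is an assumption
# here so the sketch runs without the html5lib package installed.
html = """
<div class="promotion">
    <div class="address">
        <a href="javascript:PropDetail2('57795471:MRMLS')" title="View property detail for 5203 Alhama Drive">5203 Alhama Drive</a>
    </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="address")

# The <div> itself carries no href attribute, so .get() returns None.
div_href = div.get("href")

# The href lives on the <a> nested inside the <div>.
anchor_href = div.find("a").get("href")

print(div_href)
print(anchor_href)
```

The first lookup is what the question's loop does for every row, which would explain the column of `None` values.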

I was having trouble navigating the tree; the lists kept throwing me off. That points me in the right direction, thanks.

from bs4 import BeautifulSoup

# Create soup from the html. (Here I am assuming that you have already read the file into
# the variable "html" as a string). "html5lib" is the parser used in the question.
soup = BeautifulSoup(html, "html5lib")
# Find all divs with class="address"
address_class = soup.find_all('div', {"class": "address"})
# Loop over the results
for row in address_class:
    # Each result has one <a> tag, and we need to get the href property from it.
    print(row.find('a').get('href'))
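If the goal is the value inside the `javascript:` href rather than the whole string, one possible follow-up (not part of the answer above) is a small regular expression. This is only a sketch: `extract_prop_id` and `PROP_DETAIL_RE` are hypothetical names, and the pattern assumes every href has the same `PropDetail2('...')` shape as the one in the question.

```python
import re

# Hypothetical pattern (an assumption, not from the original answer): matches
# javascript:PropDetail2('...') and captures the single-quoted argument.
PROP_DETAIL_RE = re.compile(r"javascript:PropDetail2\('([^']*)'\)")


def extract_prop_id(href):
    """Return the argument passed to PropDetail2, or None if href does not match."""
    if href is None:
        return None
    match = PROP_DETAIL_RE.search(href)
    return match.group(1) if match else None


# The href value from the question's HTML.
href = "javascript:PropDetail2('57795471:MRMLS')"
prop_id = extract_prop_id(href)
print(prop_id)
```

Inside the loop above, this could be applied as `extract_prop_id(row.find('a').get('href'))`.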

Source

2012-05-29 18:03:56 varunl

That works great, thank you. I had been trying .find_all() before, which didn't work.
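The comment doesn't say exactly what went wrong with `.find_all()`, but one likely reason is that it returns a list-like result rather than a single tag, so `.get('href')` has to be called on each element. A minimal sketch, assuming a trimmed copy of the question's markup and `"html.parser"` so it runs standalone:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's markup (an assumption for illustration).
html = """
<div class="address">
    <a href="javascript:PropDetail2('57795471:MRMLS')">5203 Alhama Drive</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
row = soup.find("div", class_="address")

# find() returns a single Tag (or None), so .get() can be called on it directly.
first_href = row.find("a").get("href")

# find_all() returns a list-like ResultSet; call .get() on each element instead.
all_hrefs = [a.get("href") for a in row.find_all("a")]

print(first_href)
print(all_hrefs)
```

With one anchor per div, `find()` is the simpler choice; `find_all()` is worth the extra loop only when a div may hold several links.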
