Parsing HTML and writing to CSV using BeautifulSoup - AttributeError or no HTML being parsed
With the code below, I either get an error or nothing gets parsed and written:
soup = BeautifulSoup(browser.page_source, 'html.parser')
userinfo = soup.find_all("div", attrs={"class": "fieldWrapper"})
rows = userinfo.find_all(attrs="value")
with open('testfile1.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(rows)
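    rows = userinfo.find_all(attrs="value")
AttributeError: 'ResultSet' object has no attribute 'find_all'
So I tried a loop with print just to test it, but the program runs successfully without returning anything: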
userinfo = soup.find_all("div", attrs={"class": "fieldWrapper"})
for row in userinfo:
    rows = row.find_all(attrs="value")
    print(rows)
This is the HTML I am trying to parse. I am trying to return the text from the value attributes:
<div class="controlHolder">
<div id="usernameWrapper" class="fieldWrapper">
<span class="styled">Username:</span>
<div class="theField">
<input name="ctl00$cleanMainPlaceHolder$tbUsername" type="text" value="username" maxlength="16" id="ctl00_cleanMainPlaceHolder_tbUsername" disabled="disabled" tabindex="1" class="textbox longTextBox">
<input type="hidden" name="ctl00$cleanMainPlaceHolder$hdnUserName" id="ctl00_cleanMainPlaceHolder_hdnUserName" value="AAubrey">
</div>
</div>
<div id="fullNameWrapper" class="fieldWrapper">
<span class="styled">Full Name:</span>
<div class="theField">
<input name="ctl00$cleanMainPlaceHolder$tbFullName" type="text" value="Full Name" maxlength="50" id="ctl00_cleanMainPlaceHolder_tbFullName" tabindex="2" class="textbox longTextBox">
<input type="hidden" name="ctl00$cleanMainPlaceHolder$hdnFullName" id="ctl00_cleanMainPlaceHolder_hdnFullName" value="Anthony Aubrey">
</div>
</div>
<div id="emailWrapper" class="fieldWrapper">
<span class="styled">Email:</span>
<div class="theField">
<input name="ctl00$cleanMainPlaceHolder$tbEmail" type="text" value="[email protected]" maxlength="60" id="ctl00_cleanMainPlaceHolder_tbEmail" tabindex="3" class="textbox longTextBox">
<input type="hidden" name="ctl00$cleanMainPlaceHolder$hdnEmail" id="ctl00_cleanMainPlaceHolder_hdnEmail" value="[email protected]">
<span id="ctl00_cleanMainPlaceHolder_validateEmail" style="color:Red;display:none;">Invalid E-Mail</span>
</div>
</div>
<div id="commentWrapper" class="fieldWrapper">
<span class="styled">Comment:</span>
<div class="theField">
<textarea name="ctl00$cleanMainPlaceHolder$tbComment" rows="2" cols="20" id="ctl00_cleanMainPlaceHolder_tbComment" tabindex="4" class="textbox longTextBox"></textarea>
<input type="hidden" name="ctl00$cleanMainPlaceHolder$hdnComment" id="ctl00_cleanMainPlaceHolder_hdnComment">
</div>
</div>
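For reference, here is a minimal sketch of one way to read the `value` attributes from markup like the above and write them as a CSV row. Two points from the question apply: `find_all` returns a `ResultSet` (a list of tags), so it has to be iterated before calling `find_all` again, and a plain string passed as `attrs` is matched against the CSS class, so `attrs="value"` looks for `class="value"` rather than for tags that have a `value` attribute. The helper names `extract_values` and `write_row`, the trimmed `SAMPLE_HTML`, and the choice to keep only the visible `type="text"` inputs are assumptions made for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup

# Trimmed copy of the form markup from the question (assumption: same structure).
SAMPLE_HTML = """
<div class="controlHolder">
  <div id="usernameWrapper" class="fieldWrapper">
    <div class="theField">
      <input type="text" value="username" class="textbox longTextBox">
      <input type="hidden" value="AAubrey">
    </div>
  </div>
  <div id="fullNameWrapper" class="fieldWrapper">
    <div class="theField">
      <input type="text" value="Full Name" class="textbox longTextBox">
      <input type="hidden" value="Anthony Aubrey">
    </div>
  </div>
</div>
"""


def extract_values(html):
    """Return the value attribute of each visible text input, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    values = []
    # find_all returns a list of tags, so loop over it instead of chaining.
    for wrapper in soup.find_all("div", attrs={"class": "fieldWrapper"}):
        # value=True matches any tag that has a value attribute at all.
        for field in wrapper.find_all("input", attrs={"type": "text", "value": True}):
            values.append(field["value"])
    return values


def write_row(values, outfile):
    """Write the collected values as a single CSV row to an open text file."""
    csv.writer(outfile).writerow(values)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_row(extract_values(SAMPLE_HTML), buffer)
    print(buffer.getvalue())
```

To write to disk instead of a buffer, pass a file opened with `open('testfile1.csv', 'w', newline='')`. To collect the hidden fields as well, drop the `"type": "text"` filter.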
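I see what you mean. I tried the code you provided, but it still prints no output. I am trying to get the text from value="username", value="Full Name" and value="[email protected]", because I am trying to pull the text out of the form. – nvachhan
Gotcha. When I initialize BeautifulSoup with the source HTML you provided, the edited answer above prints the expected output. If it still prints nothing, `browser.page_source` may not contain what you expect, or your parser may not be handling the page correctly. – lanceg
I tried the new version you wrote, but still got nothing. I added `print('no text found')` to see whether it would print anything, and it still printed nothing, which seems strange. I think you are right, and the problem may be with the page source. I use Selenium up to this point in the code without any problems. – nvachhan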
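One way to check whether the page source is the problem is to count the elements the parsing code relies on before trying to extract anything. The sketch below is only a diagnostic under assumptions: `summarize_source` is a hypothetical helper, and it expects the string obtained from `browser.page_source`. Zero `fieldWrapper` divs together with one or more iframes would suggest the form is not part of the source that was read, for example because it lives inside a frame or had not been rendered yet.

```python
from bs4 import BeautifulSoup


def summarize_source(html):
    """Count the elements the extraction code depends on (diagnostic only)."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "length": len(html),
        "field_wrappers": len(soup.find_all("div", attrs={"class": "fieldWrapper"})),
        "inputs_with_value": len(soup.find_all("input", attrs={"value": True})),
        "iframes": len(soup.find_all("iframe")),
    }


if __name__ == "__main__":
    # Stand-in for browser.page_source (assumption: a page that wraps the form in a frame).
    page = '<html><body><iframe src="form.aspx"></iframe></body></html>'
    print(summarize_source(page))
```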