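Python BeautifulSoup: scraping data and writing to Excel raises "NotImplementedError"
I am trying to write a script that scrapes a website with Python and BeautifulSoup and then writes the data to an Excel worksheet.
It works up to the writing part, and then I get a NotImplementedError. I looked it up and wrapped the writing part of the code in try: and pass blocks. That got rid of the error in the Python interpreter console window, but my Excel sheet is blank.
Here is what I have so far: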
import requests, openpyxl
from bs4 import BeautifulSoup
wb = openpyxl.Workbook('RDWM_CRM.xls')
wb.create_sheet('Phone')
sheet = wb.get_sheet_by_name('Phone')
# nav to webpage I want to scrape
url = "http://www.yellowpages.com/search?search_terms=roofing%20company&geo_location_terms=New%20York%2C%20NY&page=2"
r = requests.get(url)
soup = BeautifulSoup(r.content)
# for loop finds info then prints
for div in soup.find_all("div", {"class": "info"}):
    print(div.contents[0].text)
    print(div.contents[1].text)
# for loop finds info then writes to excel cells
for div in soup.find_all("div", {"class": "info"}):
    sheet['A1'] = div.contents[0].text
    sheet['B1'] = div.contents[1].text
wb.save('RDWM_CRM.xls')
Like I said above, even when there are no errors I get a blank Excel sheet. Here is the traceback I see in the console:
Neptune Construction
Serving the New York Area.(866) 664-1759
>>> # for loop finds info then writes to excel cells
... for div in soup.find_all("div", {"class": "info"}):
...     sheet['A1'] = div.contents[0].text
...     sheet['B1'] = div.contents[1].text
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "C:\Users\Josh\AppData\Local\Programs\Python\Python35\lib\site-packages\openpyxl\writer\write_only.py", line 223, in removed_method
    raise NotImplementedError
NotImplementedError
>>> wb.save('RDWM_CRM.xls')
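That is the last piece of data, along with the error.
Thanks for your help! I am still getting a blank Excel sheet... Here is the code I am using. There are no errors, just a blank Excel sheet. It creates the new sheet named Phone, but the sheet is blank...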
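One possible reading of this traceback, offered as a sketch rather than a confirmed diagnosis: the file path points into openpyxl's write-only worksheet code, which suggests the workbook was created in write-only mode. In recent openpyxl versions the first positional parameter of Workbook is write_only, so (as an assumption about the version used here) passing the filename 'RDWM_CRM.xls' to Workbook() would switch that mode on. Write-only sheets accept rows through append() but do not allow assignment to a cell such as sheet['A1']. The names accidental, accidental_ws and cell_assignment_failed below are hypothetical and exist only for illustration.

```python
from openpyxl import Workbook

# Default workbook: cells can be addressed directly.
wb = Workbook()
ws = wb.create_sheet('Phone')
ws['A1'] = 'Neptune Construction'
ws['B1'] = 'Serving the New York Area.(866) 664-1759'

# Assumption: a string passed positionally lands in the write_only parameter,
# so this workbook behaves as a write-only workbook.
accidental = Workbook('RDWM_CRM.xls')
accidental_ws = accidental.create_sheet('Phone')

try:
    accidental_ws['A1'] = 'Neptune Construction'
    cell_assignment_failed = False
except (NotImplementedError, TypeError):
    # NotImplementedError matches the traceback above; other openpyxl
    # versions may report the unsupported assignment as a TypeError.
    cell_assignment_failed = True

print(bool(accidental.write_only), cell_assignment_failed)
```

If this reading is right, creating the workbook with Workbook() and passing the filename only to wb.save() should avoid the error. Since openpyxl writes the .xlsx format, saving under an .xlsx name rather than .xls is probably the safer choice.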
import requests
from bs4 import BeautifulSoup
from openpyxl import Workbook
url = "http://www.yellowpages.com/search?search_terms=roofing%20company&geo_location_terms=Seattle%2C%20WA&page=4" # nav to webpage I want to scrape
r = requests.get(url)
soup = BeautifulSoup(r.content)
# create a dummy list of texts to write to excel file
divs = []
wb = Workbook() # open new workbook, use load_workbook if existing
ws = wb.create_sheet('Phone')
for div in divs:
    row = [div.contents[0].text, div.contents[1].text] # construct a row: shown only for example purposes
    ws.append(row) # could use ws.append(div) since each div is a list
wb.save('RDWM_CRM.xlsx') # save workbook, will overwrite if exists
Any help is appreciated!
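A note on the second snippet, as a hedged sketch: divs is left as an empty list there, so the loop body never runs and no rows are appended, which would explain the blank sheet. The example below fills divs from the parsed page before looping. To keep it runnable without a network call it parses a small inline HTML string, sample_html, in place of r.content. That markup, the "Acme Roofing" entry and the helper name build_workbook are hypothetical, and the real page structure may differ.

```python
from bs4 import BeautifulSoup
from openpyxl import Workbook

# Hypothetical stand-in for r.content; the real page layout is an assumption.
sample_html = (
    '<div class="info"><h2>Neptune Construction</h2>'
    '<div>Serving the New York Area.(866) 664-1759</div></div>'
    '<div class="info"><h2>Acme Roofing</h2>'
    '<div>(555) 010-0000</div></div>'
)


def build_workbook(html):
    """Parse the HTML and append one row per matching div to a 'Phone' sheet."""
    soup = BeautifulSoup(html, "html.parser")
    # Fill the list from the page instead of leaving it empty.
    divs = soup.find_all("div", {"class": "info"})

    wb = Workbook()
    ws = wb.create_sheet("Phone")
    for div in divs:
        # append() writes each row below the previous one, so rows are not overwritten.
        ws.append([div.contents[0].text, div.contents[1].text])
    return wb


wb = build_workbook(sample_html)
```

Calling wb.save('RDWM_CRM.xlsx') afterwards would write the file. Using append() rather than assigning to 'A1' and 'B1' inside the loop also keeps every result, where the fixed cell references would keep only the last one.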
Please include the traceback. Does the error occur in 'wb.save'? – memoselyk
In the second for loop; it should have printed. – user3429394
Traceback (most recent call last): File "<stdin>", line 3, in <module> File "C:\Users\Josh\AppData\Local\Programs\Python\Python35\lib\site-packages\openpyxl\writer\write_only.py", line 223, in removed_method raise NotImplementedError NotImplementedError >>> wb.save('RDWM_CRM.xls') – user3429394