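Scraping a website and moving data into multiple CSV columns
I am scraping a web page that has multiple categories into a CSV file. The first category is written as a column successfully, but the data for the second column is not written to the CSV. The code I am using: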
import urllib2
import csv
from bs4 import BeautifulSoup

url = "http://digitalstorage.journalism.cuny.edu/sandeepjunnarkar/tests/jazz.html"
page = urllib2.urlopen(url)
soup_jazz = BeautifulSoup(page)

all_years = soup_jazz.find_all("td",class_="views-field views-field-year")
all_category = soup_jazz.find_all("td",class_="views-field views-field-category-code")

with open("jazz.csv", 'w') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow([u'Year Won', u'Category'])
    for years in all_years:
        year_won = years.string
        if year_won:
            csv_writer.writerow([year_won.encode('utf-8')])
    for categories in all_category:
        category_won = categories.string
        if category_won:
            csv_writer.writerow([category_won.encode('utf-8')])
It writes the column header into the second column, but not the category_won values.
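A likely explanation, sketched with made-up in-memory data rather than the scraped page: each `writerow` call emits exactly one row, so two separate loops produce single-cell rows that all land in the first column, one under the other. Pairing the values with `zip` puts one year and one category on each row. The sample lists and the helper names `write_separately` and `write_paired` are hypothetical, and the sketch assumes Python 3 with `io.StringIO` standing in for the file.

```python
import csv
import io

# Hypothetical sample values standing in for the scraped cells.
years = ["1999", "2000"]
categories = ["Best Jazz Album", "Best Jazz Solo"]


def write_separately(years, categories):
    """Mimic the two-loop approach: every writerow call is its own row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Year Won", "Category"])
    for year in years:
        writer.writerow([year])        # one cell, so only column 1 is filled
    for category in categories:
        writer.writerow([category])    # also column 1, just further down
    return buf.getvalue()


def write_paired(years, categories):
    """Pair the values so each row carries both columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Year Won", "Category"])
    for year, category in zip(years, categories):
        writer.writerow([year, category])
    return buf.getvalue()


print(write_separately(years, categories))
print(write_paired(years, categories))
```

Note that `zip` stops at the shorter of the two lists, so if the page has a year cell without a matching category cell, the trailing values are dropped silently.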
Following your suggestion, I have edited it to read:
with open("jazz.csv", 'w') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow([u'Year Won', u'Category'])
    for years, categories in zip(all_years, all_category):
        year_won = years.string
        category_won = categories.string
        if year_won and category_won:
            csv_writer.writerow([year_won.encode('utf-8'), category_won.encode('utf-8')])
But now I get the following error:
csv_writer.writerow([year_won.encode('utf-8'), category_won.encode('utf-8')])
ValueError: I/O operation on closed file
Just tried it, and now I get the error listed above. – user1922698
@user1922698: Then you are trying to run the loop *outside* of the 'with' statement. –
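To illustrate the point about the 'with' statement, here is a minimal sketch, assuming Python 3 and hypothetical sample rows instead of the scraped data. The file is closed as soon as the indented block under `with` ends, so any `writerow` call dedented past that block raises the reported `ValueError`. The temporary path and the `rows` list are made up for the demonstration.

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for the scraped (year, category) pairs.
rows = [("1999", "Best Jazz Album"), ("2000", "Best Jazz Solo")]

# Write to a throwaway location so the demo does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "jazz_demo.csv")

with open(path, "w", newline="") as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(["Year Won", "Category"])
    for year_won, category_won in rows:
        # Still indented under the with block, so the file is open.
        csv_writer.writerow([year_won, category_won])

# The with block has ended here, which closed the file.
file_closed = f.closed

try:
    # Dedented past the with block: this reproduces the reported error.
    csv_writer.writerow(["2001", "Too late"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(file_closed, error_message)
```

In the question's code, the fix is to keep the whole `for` loop indented one level under the `with` line so every write happens while the file is still open.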
But the output generated by the above shows the same category over and over again, even though they are all different categories. – user1922698