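I have been trying to encode the list price['value'] and get the error AttributeError: 'list' object has no attribute 'encode'. Once I understood the problem, I tried many different ways of encoding the text before it is added to the list, but none of them worked. How do I use .encode('utf-8') correctly here, so that the text is encoded rather than the list and the price['value'] results contain non-Unicode data? How do I iterate over the list and encode its text items instead of the list itself?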
import mechanize
from lxml import html
import csv
import io
from time import sleep

def save_products (products, writer):
    for product in products:
        writer.writerow([ product["title"][0].encode('utf-8') ])
        for price in product['prices']:
            writer.writerow([ price["value"] ])

f_out = open('pcdResult.csv', 'wb')
writer = csv.writer(f_out)
links = ["http://purechemsdirect.com/ourprices.html/" ]
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
for link in links:
    print(link)
    r = br.open(link)
    content = r.read()
    products = []
    tree = html.fromstring(content)
    product_nodes = tree.xpath('//div[@class="col-md-6 col-lg-6 col-sm-12"]')
    for product_node in product_nodes:
        product = {}
        try:
            product['title'] = product_node.xpath('.//p/strong/text()')
        except:
            product['title'] = ""
        price_nodes = product_node.xpath('.//ul')
        product['prices'] = []
        for price_node in price_nodes:
            price = {}
            try:
                price['value'] = price_node.xpath('.//li/text()')
            except:
                price['value'] = ""
            product['prices'].append(price)
        products.append(product)
    save_products(products, writer)
f_out.close()
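The error arises because xpath('.//li/text()') returns a list of strings, and only the strings have an .encode() method, not the list. A minimal sketch of one possible fix is to encode each item as the row is written. The helper name encode_values is hypothetical (it is not part of the code above), and the sketch assumes Python 2, as the 'wb' file mode suggests. It also writes each price as its own cell, which is a change from the single cell per row in the code above.

```python
# -*- coding: utf-8 -*-

def encode_values(values, encoding='utf-8'):
    """Encode every text item in a list (hypothetical helper).

    A list has no .encode(); each string inside it does.
    """
    return [value.encode(encoding) for value in values]


def save_products(products, writer):
    # Same shape as the original function; only the row contents change.
    for product in products:
        # xpath() returns a list, so guard against an empty title list.
        title = product["title"][0] if product["title"] else u""
        writer.writerow([title.encode('utf-8')])
        for price in product['prices']:
            # price["value"] is a list of strings: encode each element
            # and write them as the cells of one row.
            writer.writerow(encode_values(price["value"]))
```

On Python 3 the csv module expects text rather than bytes, so there you would more likely open the file with encoding='utf-8' and newline='' and write the strings without calling .encode() at all.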
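@qwertyuio9 Yes, you're right: what is stored there is strings, not other lists. Where do I put the code? – McLeodx
@McLeodx I've edited my answer; let me know if this doesn't work. – qwertyuip9
Works perfectly! Thank you – McLeodx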