2017-01-22 6 views

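Collecting data as tuples using Selenium with Python

I have a problem collecting data as tuples using Selenium with Python 3.6. This is the page I want to collect data from (http://www.bobaedream.co.kr/cyber/CyberCar.php?gubun=I). I want to collect the "manufacturer" entries from the search menu at the top of the page.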

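I set up a headless browser session with the Selenium webdriver and used this code to collect the entries of the first drop-down menu and select them: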

from selenium import webdriver 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.common.exceptions import StaleElementReferenceException 
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.support.ui import Select 
from selenium.common.exceptions import NoSuchElementException 

from bs4 import BeautifulSoup 
from time import sleep 


link = 'http://www.bobaedream.co.kr/cyber/CyberCar.php?gubun=I' 
driver = webdriver.PhantomJS() 
driver.set_window_size(1920, 1080) 
driver.get(link) 
sleep(.75) 

soup = BeautifulSoup(driver.page_source, "html.parser", from_encoding='utf-8') 

manufacturers = [ 
    ('%s', '%s') % (o.text, o.get_attribute('href')) 
    for o 
    in driver.find_elements_by_css_selector("#layer_maker ul.list li a") 
    if o.text != '전체'] 

for manufacturer in manufacturers: 
    driver.execute_script("o.get_attribute('href')") 

And this is the error message I get:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/chongwonshin/PycharmProjects/Crawler_test/dump.py 
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bs4/__init__.py:146: UserWarning: You provided Unicode markup but also provided a value for from_encoding. Your from_encoding will be ignored. 
    warnings.warn("You provided Unicode markup but also provided a value for from_encoding. Your from_encoding will be ignored.") 
Traceback (most recent call last): 
    File "/Users/chongwonshin/PycharmProjects/Crawler_test/dump.py", line 23, in <module> 
    in driver.find_elements_by_css_selector("#layer_maker ul.list li a") 
    File "/Users/chongwonshin/PycharmProjects/Crawler_test/dump.py", line 24, in <listcomp> 
    if o.text != '전체'] 
TypeError: unsupported operand type(s) for %: 'tuple' and 'tuple' 

Process finished with exit code 1 

Please help.

Answers


I think this is what you need:

[
    ('%s' % o.text, '%s' % o.get_attribute('href'))
    for o
    in driver.find_elements_by_css_selector("#layer_maker ul.list li a")
    if o.text != '전체']

or simply

[
    (o.text, o.get_attribute('href'))
    for o
    in driver.find_elements_by_css_selector("#layer_maker ul.list li a")
    if o.text != '전체']

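Note that % is both the string-formatting operator and the "modulo" operator in Python. Neither is defined when the left operand is a tuple, so ('%s', '%s') % (...) raises the TypeError shown in your traceback. Format each string separately, or just put the values into the tuple directly.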
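
To see the difference without starting a browser, here is a minimal sketch. The FakeLink class and the sample entries are hypothetical stand-ins for the Selenium elements and are not taken from the real page; only the .text attribute and get_attribute() method are imitated.

```python
# Hypothetical stand-in for a Selenium WebElement, so the sketch runs
# without a browser; only .text and .get_attribute() are imitated.
class FakeLink:
    def __init__(self, text, href):
        self.text = text
        self._href = href

    def get_attribute(self, name):
        return self._href if name == 'href' else None


# Made-up sample data, not scraped from the real page.
links = [
    FakeLink('전체', '#all'),
    FakeLink('Maker A', '#a'),
    FakeLink('Maker B', '#b'),
]

# The original expression: the left operand of % is a tuple, and tuples
# support neither string formatting nor modulo.
template = ('%s', '%s')
try:
    template % ('Maker A', '#a')
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The fix: build each tuple directly from the two values.
manufacturers = [
    (o.text, o.get_attribute('href'))
    for o in links
    if o.text != '전체'
]

print(error_message)
print(manufacturers)
```

With real Selenium elements the list comprehension stays the same; only the source of the elements changes to the find_elements_by_css_selector call from the question.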