2015-11-30 62 views

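I'm trying to use Selenium and Python to print some information, but it does not print everything matched by the CSS selectors; only a single item gets written. How do I print all of the information with Selenium? Here is the while loop: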

pageIndex = 1 
while True: # Keep looping through all pages 
    # Navigate to the search page 
    browser.get("https://www.houz.com/page_num="+ str(pageIndex)) 
    time.sleep(6) 

    links = browser.find_elements_by_css_selector('div > h3 > a') 
    for link in links: 
     urls = link.text 


    jobs = browser.find_elements_by_css_selector('div > div.description') 
    for title in jobs: 
     jobtitles = title.text 


    with open("1Exportdata.csv", "a") as csvFile: 
     csvFile.write(url + "," + jobtitle + "\n") 

    pageIndex += 1 
    if pageIndex == 5010: 
     browser.close() 

What is the point of running 'for' loops that only assign new values to 'urls' and 'jobtitles'? – Andersson


I just updated the question with the full loop. – Sarfraz

Answers


Because you use:

for title in jobs: 
    jobtitles = title.text 

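On the first iteration, jobtitles is the first title.text; on the second iteration it is rebound to the second title.text, and so on. When the loop finishes, it holds only the last title.text.

For example: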

>>> for i in [1, 2, 3]:
...     num = i
...
>>> print(num)
3
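The same effect can be seen next to the alternative of keeping every value. This is only a minimal sketch: the list `texts` is a hypothetical stand-in for the `.text` values of the matched elements, and `last_seen` and `all_seen` are names made up for illustration.

```python
# Hypothetical stand-ins for the .text values of the matched elements.
texts = ["first", "second", "third"]

# Rebinding a single name on every iteration keeps only the final value.
for text in texts:
    last_seen = text

# Appending to a list keeps every value.
all_seen = []
for text in texts:
    all_seen.append(text)

print(last_seen)  # third
print(all_seen)   # ['first', 'second', 'third']
```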

So the with open("1Exportdata.csv", "a") as csvFile: block needs to go inside the for loop. And since you have two lists, I suggest pairing them up with the built-in zip:

pageIndex = 1 
while True: # Keep looping through all pages 
    # Navigate to the search page 
    browser.get("https://www.houz.com/page_num="+ str(pageIndex)) 
    time.sleep(6) 

    links = browser.find_elements_by_css_selector('div > h3 > a') 
    jobs = browser.find_elements_by_css_selector('div > div.description') 

    for link, title in zip(links, jobs): 
     url = link.text 
     jobtitle = title.text 


     with open("1Exportdata.csv", "a") as csvFile: 
      csvFile.write(url + "," + jobtitle + "\n") 

    pageIndex += 1 
    if pageIndex == 5010: 
     browser.close() 
     break # Stop looping once the browser is closed 
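One thing to keep in mind with zip is that it stops at the shorter input, so if the two selectors match a different number of elements, the extra items are dropped without any error. A small sketch with plain lists, where `link_texts` and `job_texts` are hypothetical stand-ins for the `link.text` and `title.text` values:

```python
# Hypothetical stand-ins for link.text and title.text values.
link_texts = ["Job A", "Job B", "Job C"]
job_texts = ["Description A", "Description B"]  # one fewer on purpose

# zip pairs items by position and stops at the shorter list.
pairs = list(zip(link_texts, job_texts))
for url, jobtitle in pairs:
    print(url + "," + jobtitle)

# "Job C" has no partner and is silently dropped, so compare the lengths.
if len(link_texts) != len(job_texts):
    print("warning: the two selectors matched different numbers of elements")
```

If the counts can differ on the real pages, comparing the lengths before writing is a cheap way to notice misaligned rows.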

Also, I think the while loop is unnecessary here; try a for loop instead:

for pageIndex in range(1, 5011): 
    # Navigate to the search page 
    browser.get("https://www.houz.com/page_num="+ str(pageIndex)) 
    time.sleep(6) 

    links = browser.find_elements_by_css_selector('div > h3 > a') 
    jobs = browser.find_elements_by_css_selector('div > div.description') 

    for link, title in zip(links, jobs): 
     url = link.text 
     jobtitle = title.text 


     with open("1Exportdata.csv", "a") as csvFile: 
      csvFile.write(url + "," + jobtitle + "\n") 
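Two details of the write step may be worth hardening. Joining fields with "," produces broken rows when the text itself contains a comma, and on Windows open() without an encoding falls back to a code page such as cp1252, which cannot represent every character. The sketch below is one possible approach, not the only one: it uses the standard csv module and an explicit UTF-8 encoding. The helper `append_rows`, the file name, and the sample row are all hypothetical.

```python
import csv
import os
import tempfile


def append_rows(path, rows):
    """Append (url, jobtitle) pairs to a CSV file encoded as UTF-8."""
    # newline="" lets the csv module control the line endings itself.
    with open(path, "a", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerows(rows)  # fields containing commas are quoted


# Hypothetical sample row: a non-cp1252 character and an embedded comma.
demo_path = os.path.join(tempfile.mkdtemp(), "example_export.csv")
append_rows(demo_path, [("Cloud \u2601 services", "Designer, senior")])

# Read the file back to confirm the row survived the round trip.
with open(demo_path, newline="", encoding="utf-8") as csv_file:
    read_back = list(csv.reader(csv_file))
```

In the scraping loop, the pairs produced by zip(links, jobs) could be collected as (link.text, title.text) tuples and passed to such a helper once per page, which also avoids reopening the file for every row.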

I get this error:
File "C:\Users\Win\Downloads\Runprog\Project cusom taio\Profiles\1.py", line 51, in scraper
csvFile.write(URL + "," + jobtitle + "\n")
File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2601' in position 15: character maps to <undefined>
– Sarfraz


@Sarfraz: Hmm, that was just a typo. How about now? –


I have already changed the names to singular. Here is the error:
File "C:\Users\Win\Downloads\Runprog\Project cusom taio\Profiles\1.py", line 51, in scraper
csvFile.write(url + "," + jobtitle + "\n")
File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2601' in position 15: character maps to <undefined>
– Sarfraz