Python + splinter + http: Error - httplib.ResponseNotReady

Using splinter and Python, I have two threads running, each visiting the same main URL but a different route; for example, thread one hits mainurl.com/threadone and thread two hits mainurl.com/threadtwo. Doing so produces the following error:

Traceback (most recent call last): 
    File "multi_thread_practice.py", line 299, in <module> 
    main() 
    File "multi_thread_practice.py", line 290, in main 
    first_method(r) 
    File "multi_thread_practice.py", line 195, in parser 
    second_method(title, name) 
    File "multi_thread_practice.py", line 208, in confirm_product 
    third_method(current_url) 
    File "multi_thread_practice.py", line 214, in buy_product 
    browser.visit(url) 
    File "/Users/joshua/anaconda/lib/python2.7/site-packages/splinter/driver/webdriver/__init__.py", line 184, in visit 
    self.driver.get(url) 
    File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 261, in get 
    self.execute(Command.GET, {'url': url}) 
    File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 247, in execute 
    response = self.command_executor.execute(driver_command, params) 
    File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 464, in execute 
    return self._request(command_info[0], url, body=data) 
    File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 488, in _request 
    resp = self._conn.getresponse() 
    File "/Users/joshua/anaconda/lib/python2.7/httplib.py", line 1108, in getresponse 
    raise ResponseNotReady() 
httplib.ResponseNotReady

What is this error, and how should I go about resolving it? The browser is created with:

from splinter import Browser 
browser = Browser('chrome')

This is the setup that ran into the error shown above.

Thank you in advance; I will be sure to upvote/accept the answer.

CODE ADDED

import time 
from splinter import Browser 
import threading 

browser = Browser('chrome') 

start_time = time.time() 

urlOne = 'http://www.practiceurl.com/one' 
urlTwo = 'http://www.practiceurl.com/two' 
baseUrl = 'http://practiceurl.com' 

browser.visit(baseUrl) 

def secondThread(url): 
    print 'STARTING 2ND REQUEST: ' + str(time.time() - start_time) 
    browser.visit(url) 
    print 'END 2ND REQUEST: ' + str(time.time() - start_time) 


def mainThread(url): 
    print 'STARTING 1ST REQUEST: ' + str(time.time() - start_time) 
    browser.visit(url) 
    print 'END 1ST REQUEST: ' + str(time.time() - start_time) 


def main(): 
    threadObj = threading.Thread(target=secondThread, args=[urlTwo]) 
    threadObj.daemon = True 

    threadObj.start() 

    mainThread(urlOne) 

main()

Source

2017-04-25 Jo Ko

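httplib.ResponseNotReady is usually caused by reusing a response. I can't tell whether you are doing that because there is no code, but I think that is the error.

It would help if you provided an [MCVE](https://stackoverflow.com/help/mcve). – Adonis

@GenericSnake Apologies. I just added the code to the original post. Please take a look.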

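As far as I know, what you are trying to do is not possible with a single browser. Splinter drives an actual browser, so sending it many commands at the same time causes problems. It behaves like a human interacting with the browser (automated, of course). You can open many browser windows, but you cannot send a request from another thread before the response to the previous request has been received. Doing so causes a CannotSendRequest error (or, as in your traceback, ResponseNotReady). So I recommend, if you need to use threads, opening two browsers and using one thread per browser to send the requests. Otherwise, it cannot be done.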
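To illustrate the failure mode described above, here is a minimal sketch in Python 3, where `httplib` is named `http.client`. It is not Selenium's code; it only shows how a single HTTP connection object reacts when calls arrive out of order, which is what the traceback's `self._conn.getresponse()` call runs into when two threads share one driver. The local server, `OkHandler`, and `demo_connection_states` are illustrative names, not part of the original post.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class OkHandler(BaseHTTPRequestHandler):
    """Tiny local handler so the demo needs no external site (hypothetical)."""
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def demo_connection_states():
    server = HTTPServer(("127.0.0.1", 0), OkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    errors = []
    status = None
    try:
        # 1) Asking for a response when no request is pending.
        try:
            conn.getresponse()
        except http.client.ResponseNotReady:
            errors.append("ResponseNotReady")

        # 2) Sending a second request before the first response was read.
        conn.request("GET", "/one")
        try:
            conn.request("GET", "/two")
        except http.client.CannotSendRequest:
            errors.append("CannotSendRequest")

        # 3) The correct order: read the pending response, then send again.
        conn.getresponse().read()
        conn.request("GET", "/two")
        second = conn.getresponse()
        status = second.status
        second.read()
    finally:
        conn.close()          # close the client first so the handler returns
        server.shutdown()
        server.server_close()
    return errors, status
```

With two threads driving one browser, the calls on the shared connection can interleave in exactly these ways, which is why giving each thread its own browser avoids the problem.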

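This thread is about Selenium, but the information is transferable: "Selenium multiple tabs at once". It also states that what you (I assume) want to do is not possible. The author of the accepted (green-ticked) answer gives the same advice as I do.

I hope that doesn't put you off too much, and that it helps.

EDIT: Just to demonstrate: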

import time 
from splinter import Browser 
import threading 

browser = Browser('firefox') 
browser2 = Browser('firefox') 

start_time = time.time() 

urlOne = 'http://www.practiceurl.com/one' 
urlTwo = 'http://www.practiceurl.com/two' 
baseUrl = 'http://practiceurl.com' 

browser.visit(baseUrl) 


def secondThread(url): 
    print 'STARTING 2ND REQUEST: ' + str(time.time() - start_time) 
    browser2.visit(url) 
    print 'END 2ND REQUEST: ' + str(time.time() - start_time) 


def mainThread(url): 
    print 'STARTING 1ST REQUEST: ' + str(time.time() - start_time) 
    browser.visit(url) 
    print 'END 1ST REQUEST: ' + str(time.time() - start_time) 


def main(): 
    threadObj = threading.Thread(target=secondThread, args=[urlTwo]) 
    threadObj.daemon = True 

    threadObj.start() 

    mainThread(urlOne) 
    threadObj.join() # wait for the daemon thread so its END line is printed

main()

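Note that I used Firefox because I haven't got chromedriver installed.

It might be a good idea to add a wait after the browsers open and before the timer starts, to make sure they are fully ready.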
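One way to sketch the "wait until the browsers are ready" idea in Python 3 is a `threading.Barrier`: every worker opens its own browser, waits at the barrier, and the clock starts only when the last one arrives. `run_visits` and `browser_factory` are hypothetical names introduced for this sketch; the factory is assumed to return an object with `visit(url)` and `quit()`, as a splinter `Browser` does.

```python
import threading
import time


def run_visits(browser_factory, urls):
    """Open one browser per URL and start timing only once all are open."""
    clock = {}

    def start_clock():
        # Runs exactly once, when the last worker reaches the barrier.
        clock["start"] = time.monotonic()

    ready = threading.Barrier(len(urls), action=start_clock)
    elapsed = {}

    def worker(url):
        try:
            browser = browser_factory()   # slow part: launching the browser
        except Exception:
            ready.abort()                 # do not leave the other workers waiting
            raise
        try:
            ready.wait()                  # block until every browser is open
            browser.visit(url)
            elapsed[url] = time.monotonic() - clock["start"]
        finally:
            browser.quit()

    threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return elapsed
```

Assuming splinter is installed, a call might look like `run_visits(lambda: Browser('firefox'), [urlOne, urlTwo])`; the returned timings then exclude browser start-up.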

Source

2017-04-28 01:11:49

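Appreciate your input! I did give it a try, but the two windows don't seem to be connected; that is, what happens in one window has no effect on the other open window. So I'm thinking of having the second thread open a new tab in the same window. Is that possible?

You should be able to open a new tab in one browser window and then open a different URL in it. But keep in mind that this cannot happen at the same time as you open the other URL. You have to wait for one tab to receive the response to its request before sending a request through the second tab. That makes threading somewhat pointless. I think @asettouf knows more about this topic than I do, so he may be able to show more with his example, which could help you.

Appreciate the insight regardless. How can I open a new tab with 'splinter', though?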

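@GenericSnake is right on this one. To add a little to it, I would strongly advise you to refactor your code to use the multiprocessing library, mainly because the threading implementation uses the GIL: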

In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
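The last sentence of that quote can be checked with a small Python 3 sketch. `fake_page_load` is a hypothetical stand-in that just sleeps, on the assumption that a blocking wait for a page behaves like any other blocking I/O call and releases the GIL while it waits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_page_load(url, delay=0.2):
    """Pretend to load a page; sleeping releases the GIL like a blocking wait."""
    time.sleep(delay)
    return url


def timed_loads(urls, workers):
    """Load all URLs with the given number of threads and time the batch."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_page_load, urls))
    return results, time.monotonic() - start
```

Running `timed_loads(urls, 4)` and `timed_loads(urls, 1)` on the same four URLs should show the threaded batch finishing sooner, since the waits overlap. The catch in this thread is not the GIL but that one browser instance cannot be shared safely between threads.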

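An actual benefit of using multiprocessing is that you can refactor your code to avoid the duplicated methods secondThread and mainThread, for example like this (one last thing: don't forget to clean up the resources you use, e.g. call browser.quit() to close the browser once you are done):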

import time 
from splinter import Browser 
from multiprocessing import Process 
import os 

# PATH entries must be separated by os.pathsep (":" on Unix, ";" on Windows) 
os.environ['PATH'] = os.pathsep.join([ 
    os.environ['PATH'], "path/to/geckodriver", "path/to/firefox/binary"]) 

start_time = time.time() 

urlOne = 'http://pythoncarsecurity.com/Support/FAQ.aspx' 
urlTwo = 'http://pythoncarsecurity.com/Products/' 



def url_visitor(url): 
    print("url called: " + url) 
    browser = Browser('firefox') 
    print('STARTING REQUEST TO: ' + url + " at "+ str(time.time() - start_time)) 
    browser.visit(url) 
    print('END REQUEST TO: ' + url + " at "+ str(time.time() - start_time)) 

def main(): 
    p1 = Process(target=url_visitor, args=[urlTwo]) 
    p2 = Process(target=url_visitor, args=[urlOne]) 
    p1.start() 
    p2.start() 
    p1.join() #join processes to the main process to see the output 
    p2.join() 

if __name__=="__main__": 
    main()

That gives us the following output (timings will depend on your system, though):

url called: http://pythoncarsecurity.com/Support/FAQ.aspx 
url called: http://pythoncarsecurity.com/Products/ 
STARTING REQUEST TO: http://pythoncarsecurity.com/Support/FAQ.aspx at 10.763000011444092 
STARTING REQUEST TO: http://pythoncarsecurity.com/Products/ at 11.764999866485596 
END REQUEST TO: http://pythoncarsecurity.com/Support/FAQ.aspx at 16.20199990272522 
END REQUEST TO: http://pythoncarsecurity.com/Products/ at 16.625999927520752
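The reminder above about calling browser.quit() can be made harder to forget with a small context manager. This is a sketch in Python 3 under the assumption that the browser object exposes `visit(url)` and `quit()`; `managed_browser` and `browser_factory` are names introduced here, not part of splinter.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(browser_factory):
    """Yield a browser and always quit it, even if the visit fails."""
    browser = browser_factory()
    try:
        yield browser
    finally:
        browser.quit()


def url_visitor(url, browser_factory):
    """Visit one URL in a browser that is closed again afterwards."""
    with managed_browser(browser_factory) as browser:
        browser.visit(url)
    return url
```

Passing a factory such as `lambda: Browser('firefox')` keeps browser creation inside the worker process, as in the multiprocessing example above.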

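EDIT: The issue with multithreading and Selenium is that a browser instance is not thread-safe. The only way I found to work around this is to acquire a lock in url_visitor; however, in that case you lose the advantage of multithreading. That is why I believe using multiple browsers is more beneficial (though I guess you have some very specific requirements). See the code below: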

import time 
from splinter import Browser 
import threading 
from threading import Lock 
import os 

# PATH entries must be separated by os.pathsep (":" on Unix, ";" on Windows) 
os.environ['PATH'] = os.environ['PATH'] + os.pathsep + "/path/to/chromedriver" 

start_time = time.time() 

urlOne = 'http://pythoncarsecurity.com/Support/FAQ.aspx' 
urlTwo = 'http://pythoncarsecurity.com/Products/' 
browser = Browser('chrome') 
lock = threading.Lock()#create a lock for the url_visitor method 

def init(): 
    browser.visit("https://www.google.fr") 
    # open a second, blank tab in the same browser window 
    browser.driver.execute_script("window.open('about:blank', '_blank');") 


def url_visitor(url, tab_index): 
    # the lock serializes all access to the shared browser, which is not thread-safe 
    with lock: 
        driver = browser.driver 
        # always switch to this thread's tab before navigating 
        driver.switch_to.window(driver.window_handles[tab_index]) 
        print("url called: " + url) 
        print('STARTING REQUEST TO: ' + url + " at " + str(time.time() - start_time)) 
        browser.visit(url) 
        print('END REQUEST TO: ' + url + " at " + str(time.time() - start_time)) 


def main(): 
    t1 = threading.Thread(target=url_visitor, args=[urlTwo, 0]) 
    t2 = threading.Thread(target=url_visitor, args=[urlOne, 1]) 
    t1.start() 
    t2.start() 
    t1.join() # wait for both threads before closing the shared browser 
    t2.join() 
    browser.quit() 

if __name__=="__main__": 
    init() #create a browser with two tabs 
    main()
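To see what the lock buys and what it costs, here is a small Python 3 sketch that needs no real browser. `OverlapDetectingBrowser` is a hypothetical stand-in that records whether two `visit` calls were ever in flight at the same time; `visit_all` is likewise a name introduced for illustration.

```python
import threading
import time


class OverlapDetectingBrowser:
    """Stand-in for a browser: notes whether two visits ever overlapped."""

    def __init__(self):
        self._busy = False
        self.overlap_seen = False
        self.visited = []

    def visit(self, url):
        if self._busy:
            self.overlap_seen = True   # another thread is mid-visit
        self._busy = True
        time.sleep(0.05)               # pretend to wait for the page
        self.visited.append(url)
        self._busy = False


def visit_all(browser, urls, lock=None):
    """Visit every URL from its own thread, optionally guarded by a lock."""
    def worker(url):
        if lock is None:
            browser.visit(url)
        else:
            with lock:                 # only one thread may use the browser
                browser.visit(url)

    threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return browser
```

With `lock=threading.Lock()` the visits never overlap, but they also run one after another, so the total time is roughly the sum of the individual visits. That is the trade-off described above.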

Source

2017-04-28 09:19:59 Adonis

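Thank you for the suggestion! I would like both threads/processes to operate on the same window so that the actions they perform are related. With two windows, they are unrelated. Is it possible for one process/thread to open a new tab, but in the same window as the main process/thread? Thank you in advance!

@JoKo When you have two processes, they do not share the same memory, so as far as I know you are stuck with multithreading when using a single window. I'll come back later with an example. – Adonis

Got it. Looking forward to it. Thank you!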
