2011-06-27 49 views
9

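Need help writing a Twisted proxy

I want to write a simple proxy that shuffles the text in the body of the requested pages. I have read parts of the Twisted documentation and some similar questions here on Stack Overflow, but I'm a bit of a noob, so I still don't get it.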

This is what I have so far. I don't know how to access and modify the page:

from twisted.web import proxy, http 
from twisted.internet import protocol, reactor 
from twisted.python import log 
import sys 

log.startLogging(sys.stdout) 

class ProxyProtocol(http.HTTPChannel): 
    requestFactory = PageHandler 

class ProxyFactory(http.HTTPFactory): 
    protocol = ProxyProtocol 

if __name__ == '__main__': 
    reactor.listenTCP(8080, ProxyFactory()) 
    reactor.run() 

Can you help me? I would appreciate a simple example (e.g. adding something to the body).

Answers

6

What I did was implement a new ProxyClient that modifies the data after it has been downloaded from the web server and before it is sent to the web browser.

from twisted.internet import reactor
from twisted.web import proxy, http


class MyProxyClient(proxy.ProxyClient):
    def __init__(self, *args, **kwargs):
        # The response arrives as bytes, so buffer it as bytes.
        self.buffer = b""
        proxy.ProxyClient.__init__(self, *args, **kwargs)

    def handleResponsePart(self, buffer):
        # Here you get the data retrieved from the web server.
        # In this example, we buffer the whole page so it can be shuffled later.
        # Append each chunk so the page keeps its original order.
        self.buffer = self.buffer + buffer

    def handleResponseEnd(self):
        if not self._finished:
            # Modify self.buffer here, before anything is written to the client.
            # We might have increased or decreased the page size. Since we have not written
            # to the client yet, we can still modify the headers.
            self.father.responseHeaders.setRawHeaders(
                b"content-length", [str(len(self.buffer)).encode("ascii")])
            self.father.write(self.buffer)
        proxy.ProxyClient.handleResponseEnd(self)


class MyProxyClientFactory(proxy.ProxyClientFactory):
    protocol = MyProxyClient


class ProxyRequest(proxy.ProxyRequest):
    # Byte-string keys work on both Python 2 and Python 3.
    protocols = {b'http': MyProxyClientFactory}
    ports = {b'http': 80}

    def process(self):
        proxy.ProxyRequest.process(self)


class MyProxy(http.HTTPChannel):
    requestFactory = ProxyRequest


class ProxyFactory(http.HTTPFactory):
    protocol = MyProxy


if __name__ == '__main__':
    # Same port as in the question.
    reactor.listenTCP(8080, ProxyFactory())
    reactor.run()

Hope this works for you too.

+0

When my proxied request gets a 404 error response, I get "Unhandled error in Deferred: Failure: twisted.web.error.Error: 404 Not Found". How do I catch this error?