Proxy cache server in Python

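I have a homework assignment that involves implementing a proxy cache server in Python. The idea is to write the web pages I visit to temporary files on my local machine and then serve them from those files when requests come in. Right now the code looks like this: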

from socket import *
import sys

def main():
    # Create a server socket, bind it to a port and start listening
    tcpSerSock = socket(AF_INET, SOCK_STREAM) # Initializing socket
    tcpSerSock.bind(("", 8030)) # Binding socket to port
    tcpSerSock.listen(5) # Listening for page requests
    while True:
        # Start receiving data from the client
        print 'Ready to serve...'
        tcpCliSock, addr = tcpSerSock.accept()
        print 'Received a connection from:', addr
        message = tcpCliSock.recv(1024)
        print message

        # Extract the filename from the given message
        print message.split()[1]
        filename = message.split()[1].partition("/")[2]
        print filename
        fileExist = "false"
        filetouse = "/" + filename
        print filetouse

        try: # Check whether the file exists in the cache
            f = open(filetouse[1:], "r")
            outputdata = f.readlines()
            fileExist = "true"
            # ProxyServer finds a cache hit and generates a response message
            tcpCliSock.send("HTTP/1.0 200 OK\r\n")
            tcpCliSock.send("Content-Type:text/html\r\n")
            for data in outputdata:
                tcpCliSock.send(data)
            print 'Read from cache'
        except IOError: # Error handling for file not found in cache
            if fileExist == "false":

                c = socket(AF_INET, SOCK_STREAM) # Create a socket on the proxyserver
                hostn = filename.replace("www.","",1)
                print hostn
                try:
                    c.connect((hostn, 80)) # https://docs.python.org/2/library/socket.html
                    # Create a temporary file on this socket and ask port 80 for
                    # the file requested by the client
                    fileobj = c.makefile('r', 0)
                    fileobj.write("GET " + "http://" + filename + "HTTP/1.0\r\n")
                    # Read the response into buffer
                    buffr = fileobj.readlines()
                    # Create a new file in the cache for the requested file.
                    # Also send the response in the buffer to client socket and the
                    # corresponding file in the cache
                    tmpFile = open(filename,"wb")
                    for data in buffr:
                        tmpFile.write(data)
                        tcpCliSock.send(data)
                except:
                    print "Illegal request"
            else: # File not found
                print "404: File Not Found"
        tcpCliSock.close() # Close the client and the server sockets

main()

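To test my code, I run the proxy cache on my local machine and set my browser's proxy settings accordingly.

However, when I run this code and try to access Google through Chrome, I am greeted with an error page saying err_empty_response.

Stepping through the code with a debugger showed me that it fails on this line: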

c.connect((hostn, 80))

I can't figure out why. Any help would be greatly appreciated.

P.S. I am testing this with Google Chrome, Python 2.7, and Windows 10.

Source

2016-03-01 Nick Gilbert

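Cutting off the 'www.' is dangerous. The name without 'www.' does not have to resolve at all, nor does it have to resolve to the same IP address as the name with 'www.'. – jsfan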

Yes, that makes sense. Unfortunately, removing that part didn't fix the problem –

No. See my answer. You need to resolve the name first. Check the getaddrinfo() documentation (https://docs.python.org/2/library/socket.html#socket.getaddrinfo) – jsfan

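You should not pass that name to connect() as it stands. connect() needs an address it can reach: an IP address, or a bare hostname that resolves, with no scheme or path attached.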

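You can use getaddrinfo() to get the socket information needed to build the connection. In my pure-python-whois package I use the following code to create a connection: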

# Excerpt from a class; assumes `import socket` at module level.
def _openconn(self, server, timeout, port=None):
    port = port if port else 'nicname'
    try:
        for srv in socket.getaddrinfo(server, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, socket.AI_ADDRCONFIG):
            af, socktype, proto, _, sa = srv
            try:
                c = socket.socket(af, socktype, proto)
            except socket.error:
                c = None
                continue
            try:
                if self.source_addr:
                    c.bind(self.source_addr)
                c.settimeout(timeout)
                c.connect(sa)
            except socket.error:
                c.close()
                c = None
                continue
            break
    except socket.gaierror:
        return False

    return c

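Note that this isn't great code, because the loop is effectively there for nothing rather than trying the different alternatives. You should only break out of the loop once a connection has been established. Still, it should serve as an illustration of how to use getaddrinfo().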
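
A minimal sketch of the pattern that note describes, assuming a hypothetical helper named open_connection (it is not part of pure-python-whois): it walks through every candidate that socket.getaddrinfo() returns and stops only once connect() has succeeded. It is written to run under Python 3, but the same socket calls exist in Python 2.7.

```python
import socket


def open_connection(host, port, timeout=5.0):
    """Try each address getaddrinfo() returns; return a connected socket or None.

    open_connection is a hypothetical helper name used for illustration.
    """
    try:
        candidates = socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    except socket.gaierror:
        return None  # the name did not resolve at all

    for af, socktype, proto, _canonname, sockaddr in candidates:
        try:
            conn = socket.socket(af, socktype, proto)
        except socket.error:
            continue  # this address family is unusable here, try the next one
        try:
            conn.settimeout(timeout)
            conn.connect(sockaddr)
        except socket.error:
            conn.close()
            continue  # this candidate failed, try the next one
        return conn  # the first successful connection wins
    return None
```

In the proxy from the question, a call such as open_connection(hostname, 80) would take the place of the bare c.connect((hostn, 80)), provided hostname really is just a hostname.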

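Edit: You are also not sanitising your hostname correctly. When I try to access http://www.example.com/, I get /www.example.com/, which obviously will not resolve. I suggest you use a regular expression to get the filename for your cache.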
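
One way this could look, as a hedged sketch: the regular expression and the names split_target and cache_name are assumptions made for illustration, not code from the question or from any library. The idea is to separate the hostname (the only part that gets resolved) from the path, and to derive a cache filename that contains no slashes.

```python
import re

# Optional "http://", then the host up to ':' or '/', then optional port and path.
_TARGET_RE = re.compile(
    r"^(?:http://)?(?P<host>[^/:]+)(?::(?P<port>\d+))?(?P<path>/.*)?$")


def split_target(target):
    """Split a request target like 'http://www.example.com/a' into (host, port, path)."""
    match = _TARGET_RE.match(target)
    if match is None:
        raise ValueError("unsupported request target: %r" % target)
    port = int(match.group("port") or 80)   # default to the HTTP port
    path = match.group("path") or "/"       # default to the root path
    return match.group("host"), port, path


def cache_name(host, path):
    """Build a flat cache filename by replacing unsafe characters with '_'."""
    return re.sub(r"[^A-Za-z0-9.-]", "_", host + path)
```

With this, split_target("http://www.example.com/") yields the host www.example.com, which can be resolved, while a leftover such as /www.example.com/ is rejected with a ValueError instead of being passed on to the resolver.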

Source

2016-03-01 05:10:42 jsfan

Changing the line to srv = c.getaddrinfo(filename, 80) and then c.connect((srv, 80)) throws an error saying the socket object has no attribute getaddrinfo –

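You still have to make sure that filename is actually a hostname. You cannot have a path in the hostname when resolving it. It must be nothing but a hostname. – jsfan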

I just tested it by calling c.getaddrinfo(("www.google.com", 80)) and got the same error: _socketobject object has no attribute getaddrinfo –
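
The AttributeError in these comments arises because getaddrinfo() is a function of the socket module, not a method of socket objects, and it takes the host and port as two separate arguments rather than as one tuple. A small sketch of the distinction, where the helper name first_sockaddr is an assumption for illustration:

```python
import socket


def first_sockaddr(host, port):
    """Resolve host and port, returning the first (family, sockaddr) pair.

    first_sockaddr is a hypothetical helper. The lookup goes through the
    socket *module*, with host and port passed as separate arguments.
    """
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    family, _socktype, _proto, _canonname, sockaddr = infos[0]
    return family, sockaddr


# The sockaddr tuple is what connect() expects, for example:
#   family, sockaddr = first_sockaddr("www.google.com", 80)
#   c = socket.socket(family, socket.SOCK_STREAM)
#   c.connect(sockaddr)
```

With from socket import *, as in the question, the same function is reachable as plain getaddrinfo(host, port), never as c.getaddrinfo(...).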
