2015-07-01 59 views
-1

For some reason, I want to send raw HTTP headers to a server. Can Python requests do this? For example, how do I send raw HTTP headers like these:

GET http://baidu.com/ HTTP/1.1 
Host: baidu.com 
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
Accept-Language: en-US,en;q=0.5 
Accept-Encoding: gzip, deflate 
Connection: keep-alive 

I found that Twisted can do this, but it is a bit complicated.

Answers

2

Using Twisted

from twisted.internet import reactor 
from twisted.web.client import Agent 
from twisted.web.http_headers import Headers 

agent = Agent(reactor) 

d = agent.request(
    b'GET',  # on Python 3, Agent.request expects the method and URI as bytes
    b'http://baidu.com/',
    Headers({ 
      'User-Agent': ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0'], 
      'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 
      'Accept-Language': ['en-US,en;q=0.5'], 
      'Accept-Encoding': ['gzip, deflate'], 
      'Connection': ['keep-alive'] 
     }), 
    None) 

def Response(null): 
    print('Response received') 

def Shutdown(null): 
    print('Shutting down the reactor now') 
    reactor.stop() 

d.addCallback(Response)  # call Response() once the response is received
d.addBoth(Shutdown)   # stop the reactor afterwards, on success or failure
reactor.run() 

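This is more complicated (especially if you want to "do stuff" with the response), but Twisted is something you should know if you plan to do web or concurrent programming in Python. I hope this helps you; if not, I hope it helps someone else who is struggling with HTTP headers and Twisted.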

Edit - March 7, 2016

Using treq

from __future__ import print_function 
from treq import get 
from twisted.internet.task import react 


def handleResponse(response): 
    """ Callback Function 

    Once the response is received, display the information.
    This is the part where I suspect people will have the most 
    trouble wrapping their heads around since it's heavily 
    dependent on deferreds (ie. futures or promises). 
    """ 
    print('Code: %s\n' % response.code) 

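    print('Simple print:')
    d = response.content()      # returns a deferred that fires with the body bytes
    d.addCallback(print)       # simple way to print on py2 & py3

    d.addCallback(lambda _: response.text())  # then fetch the decoded text (also a deferred)
    d.addCallback(displayText)    # the way you should be handling responses, ie. via callbacks
    return d    # returned so that react() waits for the body before stopping the reactor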

def displayText(text): 
    """ Callback Function 

    Simply display the text. You would usually do more useful 
    things in this callback, such as manipulating the response
    text or setting the text to some global or otherwise accessible 
    variable(s). 
    """ 
    print('Deferred print:') 
    print(text) 

def main(reactor): 
    """ 
    This is the main function which will execute a request using the 
    GET method. After getting the response, the response code and content 
    will be displayed. Finally, the twisted reactor will stop (since 
    the react function is being used). 
    """ 
    url = 'http://baidu.com/' 
    header={ 
     'User-Agent': ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0'], 
     'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 
     'Accept-Language': ['en-US,en;q=0.5'], 
     'Accept-Encoding': ['gzip, deflate'], 
     'Connection': ['keep-alive']} 

    d = get(url, headers=header) 
    d.addCallback(handleResponse) 
    return d 


react(main)   # run the main function and display results 

The treq package is easier to use than Twisted directly, and it shares many features and much of its syntax with requests.


+0

Thanks, your answer is more useful than I expected. – Hao

+0

You can always upvote ;) –

1

You can do it like this:

import requests  

headers = {'Host': 'baidu.com', 
      'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0',
      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 
      'Accept-Language': 'en-US,en;q=0.5', 
      'Accept-Encoding': 'gzip, deflate', 
      'Connection': 'keep-alive'} 

requests.get('http://baidu.com/', headers=headers) 
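
If you already have the request as pasted text, as in the question, a small helper can turn it into the method, URL and headers dict used above. This is only a sketch: `parse_raw_request` and `RAW` are hypothetical names and not part of requests, and a plain dict keeps only the last value if a header name is repeated.

```python
def parse_raw_request(raw):
    """Split a pasted request head into (method, target, headers).

    Hypothetical helper for illustration; it does no validation.
    """
    lines = [line.strip() for line in raw.strip().splitlines()]
    # The first line is the request line: METHOD TARGET VERSION
    method, target, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, headers


RAW = """\
GET http://baidu.com/ HTTP/1.1
Host: baidu.com
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
"""

method, target, headers = parse_raw_request(RAW)
```

The result can then be passed along as `requests.request(method, target, headers=headers)`.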
1

The requests.request method (and all of its derivatives, such as requests.get and requests.head) accepts a headers argument. See the requests documentation on custom headers.

You can use it like this:

requests.get('http://baidu.com', headers={'Host':'baidu.com', 
              'Accept-Encoding': 'gzip, deflate', 
              ...})
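
One caveat: requests fills in its own defaults for any header you do not set. If the request head must contain exactly the headers you list and nothing else, the standard library's `http.client` is one alternative. Note that this is a different technique from the answers above, and `send_exact_headers` and `HEADERS` are hypothetical names used only for this sketch.

```python
import http.client


def send_exact_headers(conn, method, target, headers):
    """Send a request head containing only the given headers, in order."""
    # skip_host / skip_accept_encoding stop http.client from adding its own
    # Host and Accept-Encoding headers.
    conn.putrequest(method, target, skip_host=True, skip_accept_encoding=True)
    for name, value in headers:
        conn.putheader(name, value)
    # Writes the request line, the headers and the terminating blank line.
    conn.endheaders()


# A list of pairs (rather than a dict) keeps the order explicit.
HEADERS = [
    ("Host", "baidu.com"),
    ("Accept-Encoding", "gzip, deflate"),
    ("Connection", "keep-alive"),
]
```

With `conn = http.client.HTTPConnection("baidu.com", 80)`, calling `send_exact_headers(conn, "GET", "http://baidu.com/", HEADERS)` and then `conn.getresponse()` would perform the request.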