Getting started with Scrapy

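I have just started working through the Scrapy documentation, and I was wondering whether anyone could give me a proper line-by-line explanation of the following code: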

def parse(self, response):
    filename = response.url.split("/")[-2] + '.html'
    with open(filename, 'wb') as f:
        f.write(response.body)

Source

2016-01-07 Rahul baboota

Have you seen http://doc.scrapy.org/en/stable/intro/tutorial.html#our-first-spider ?

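parse(): a method of the spider that is called with the downloaded Response object for each start URL. The response is passed to the method as its first and only argument.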

# a method called parse that takes one argument: response
def parse(self, response):
    # get the URL (string) from the response object [1]
    # split [2] the string on the "/" character
    # generate a filename from the list of split strings
    filename = response.url.split("/")[-2] + '.html'
    # open [3] a file called filename and write [4] into it the body
    # of the response (i.e. the contents of the scraped page)
    with open(filename, 'wb') as f:
        f.write(response.body)

[1] http://doc.scrapy.org/en/stable/topics/request-response.html#scrapy.http.Response

[2] https://docs.python.org/2/library/stdtypes.html#str.split

[3] https://docs.python.org/2/library/functions.html#open

[4] https://docs.python.org/2/library/stdtypes.html#file.write
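To see what the filename line actually produces, it may help to run the same expression outside Scrapy. The sketch below is only an illustration: the helper name filename_from_url and the example.com URLs are made up for this example and are not part of the tutorial. It shows that with a trailing slash the last element of the split is an empty string, so [-2] picks the last path segment.

```python
def filename_from_url(url):
    # Hypothetical helper: the same expression used inside parse().
    return url.split("/")[-2] + ".html"


# Illustrative URL (an assumption, not taken from the tutorial).
url = "http://www.example.com/Python/Books/"

# Splitting on "/" gives:
# ['http:', '', 'www.example.com', 'Python', 'Books', '']
# The trailing slash leaves an empty string at index -1,
# so index -2 is the last path segment, 'Books'.
print(url.split("/"))
print(filename_from_url(url))

# Without the trailing slash, index -2 is the segment before the last one.
print(filename_from_url("http://www.example.com/Python/Books"))
```

So the choice of [-2] quietly assumes that every start URL ends with a slash; for URLs without one, [-1] would be the segment you probably want.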

Source

2016-01-07 11:13:11