2016-02-21 145 views

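Is there a way to multithread this function so that only 5 URLs from the list are processed at a time? See the code below. It is Python 2.7. (Tags: multithreading, python)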

import requests, csv, time, json, threading 
from lxml import html 
from csv import DictWriter 

All_links = ['http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.343097&longitude=-71.123046&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.398588&longitude=-71.24505&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.394319&longitude=-71.218049&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.365396&longitude=-71.23165&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.356719&longitude=-71.250479&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.385096&longitude=-71.208399&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.334146&longitude=-71.183298&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA', 
'http://www.clopaydoor.com/api/v1/dealerlocator/getdealers?latitude=42.374296&longitude=-71.182371&doorType=residential&isFirstSearch=true&isHomeDepot=false&isClopayDealer=true&radius=3000&country=USA'] 

target = open('completedlinks.txt','ab') 
def get_data(each): 
    each = each.strip('\n') 
    r = requests.get(each) 
    source = json.loads(r.content) 
    the_file = open("output.csv", "ab") 
    writer = DictWriter(the_file, source[1].keys()) 
    writer.writeheader() 
    writer.writerows(source) 
    the_file.close() 
    target.write(each+'\n') 
    print each+"\n--------------------------" 


for each in All_links:
    try:
        get_data(each)
    except:
        pass

Answers


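Take a look at the multiprocessing package. It implements worker pools that can do this. Note that multiprocessing.Pool is a pool of processes; if you want actual threads, the same interface is available as multiprocessing.pool.ThreadPool (also exposed as multiprocessing.dummy.Pool).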

Update: adding something like this should work

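from multiprocessing import Pool

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

def threadit(threads, links):
    # Work through the links in batches of `threads`; each batch gets its
    # own pool and must finish before the next batch starts.
    for part in chunks(links, threads):
        pool = Pool(threads)
        for link in part:
            # get_data is the function defined in the question
            pool.apply_async(get_data, args=(link,))
        pool.close()
        pool.join()

# The guard is required on Windows, where worker processes re-import this module.
if __name__ == '__main__':
    threadit(5, All_links)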
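If real threads rather than processes are wanted, a single thread pool sized to N may be enough, with no need to split the list into chunks by hand. The following is only a sketch under some assumptions: run_limited and fake_get_data are hypothetical names that do not appear in the question, the example.com URLs are placeholders, and fake_get_data merely stands in for get_data so that the snippet runs without network access. It is written to run on Python 3 and should also run on 2.7.

```python
from multiprocessing.pool import ThreadPool
import threading
import time

def run_limited(func, items, max_workers=5):
    """Call func(item) for each item with at most max_workers calls in flight."""
    pool = ThreadPool(max_workers)
    try:
        # map() blocks until every item is done and keeps the input order.
        return pool.map(func, items)
    finally:
        pool.close()
        pool.join()

# Hypothetical stand-in for get_data; it records how many calls overlap.
_lock = threading.Lock()
_active = 0
peak = 0

def fake_get_data(url):
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.05)  # pretend to wait on the network
    with _lock:
        _active -= 1
    return url.upper()

# Placeholder URLs, not the ones from the question.
urls = ['http://example.com/%d' % i for i in range(8)]
results = run_limited(fake_get_data, urls, max_workers=5)
```

With the code from the question, the call would presumably be run_limited(get_data, All_links, 5). Two caveats: pool.map re-raises the first exception a worker hits instead of silently skipping it, and get_data appends to shared files, so its writes would likely need to be guarded by a threading.Lock once several threads run at the same time.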

I can't figure out how to limit the threads to "N". Could you please show it using the code above?