2013-07-30 76 views
0

我有兩個清單:prices_distincts,prices循環工作時間太長

他們通過hash_brand_artnum連接,兩者通過hash_brand_artnum 排序我不明白爲什麼循環工作了這麼久:

  1. 如果prices_distincts長度爲100,000它爲30 min

  2. 但如果prices_distincts的長度爲10,000,則適用於10 sec

代碼:

prices_distincts = [{'hash_brand_artnum':1202},...,..] 
prices = [{'hash_brand_artnum':1202,'price':12.077},...,...] 

for prices_distinct in prices_distincts: 
    for price in list(prices):    
     if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']: 

      print price['hash_brand_artnum'] 
      #print prices 
      del prices[0] 
     else: 
      continue 

我需要尋找具有相同價格的項目。 price_distincts和價格之間的關係是一對多關係。並且價格相同['hash_brand_artnum']

+8

它的工作,只要因爲你的算法是O(N^2)和100000^2 =百億和10000^2 = 100000000 所以之間的兩個數因子爲100和30分鐘和10秒之間因子〜100 –

+0

@RomanPekar這應該是一個答案。 – iamnotmaynard

+0

@iamnotmaynard done :) –

回答

10

因爲你的算法是O(N^2)和100000^2 = 10000000000和10000^2 = 100000000,所以它的工作時間很長。所以兩個數之間的因子是100,和30分鐘至10秒〜100之間的因子。

編輯:很難說你的代碼和這麼少的數據,我不知道你的任務是什麼,但我認爲你的字典不是很有用。 可能是試試這個:

>>> prices_distincts = [{'hash_brand_artnum':1202}, {'hash_brand_artnum':14}] 
>>> prices = [{'hash_brand_artnum':1202, 'price':12.077}, {'hash_brand_artnum':14, 'price':15}] 
# turning first list of dicts into simple list of numbers 
>>> dist = [x['hash_brand_artnum'] for x in prices_distincts] 
# turning second list of dicts into dict where number is a key and price is a value 
>>> pr = {x['hash_brand_artnum']:x["price"] for x in prices} 

不是你可以遍歷穿透式你的電話號碼,並得到價格:

>>> for d in dist: 
...  print d, pr[d] 
+0

羅馬,請看看[這篇文章]最後的評論(http://stackoverflow.com/questions/17973049/sql-query-to-select-distinct-rows-from-left-table-after-inner加入到右邊) – Trix

5

正如@RomanPekar提到的,你的算法運行緩慢,因爲它的複雜性是O(n^2)。爲了解決這個問題,你應該把它寫爲O(n)算法:

import itertools as it 

for price, prices_distinct in it.izip(prices, prices_distincts): 
    if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']: 
     # do stuff 
+0

這並沒有真正回答這個問題。問題是「爲什麼?」 –

+0

'itertools.izip'。 –

0

如果價格的增長或多或少與prices_distincts,然後如果你乘以10 prices_distincts的大小,你原來的10秒就會被乘以10再再次減去10(循環的第二個),然後由於「列表(價格)」(順便說一下,應該明確地在循環外完成)減去〜2:

10sec * 10 * 10 * 2 = 2000秒= 33分鐘

該轉換通常很昂貴。

prices_distincts = [{'hash_brand_artnum':1202},...,..] 
prices = [{'hash_brand_artnum':1202,'price':12.077},...,...] 
list_prices = list(prices) 

for prices_distinct in prices_distincts: 
    for price in list_prices:    
     if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']: 

      print price['hash_brand_artnum'] 
      #print prices 
      del prices[0] 
     else: 
      continue