

  1. A single object, say a cow, out of 1000.
  2. Every other object.

The comparison would be something like: sum(common words + common images + ...).
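
One way to read this comparison is as a sum of overlap counts, one count per attribute. The sketch below assumes each object is a dict that maps an attribute name to a set of items. The names `ATTRIBUTES` and `similarity` and the sample data are hypothetical, and exact set membership stands in for whatever "common" ends up meaning for images.

```python
# Hypothetical attribute names; only words and images are named explicitly above.
ATTRIBUTES = ("words", "images")

def similarity(a, b):
    """Sum, over all attributes, of the number of items a and b share."""
    return sum(len(a[attr] & b[attr]) for attr in ATTRIBUTES)

# Made-up sample objects, for illustration only.
cow = {"words": {"farm", "milk", "grass"}, "images": {"img1", "img2"}}
goat = {"words": {"farm", "milk", "hill"}, "images": {"img2"}}
car = {"words": {"road", "wheel"}, "images": {"img9"}}

# Case 1: compare a single object against every other object.
others = {"goat": goat, "car": car}
best = max(others, key=lambda name: similarity(cow, others[name]))
print(best, similarity(cow, others[best]))
```

Case 2 (every object against every other object) would apply the same function to each pair.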




When do two objects have an image in common? – 2014-10-10 12:59:34


When they have similar robust hashes, as measured by Hamming distance. – schoon 2014-10-10 13:12:09
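
A minimal sketch of that test, assuming the robust hashes are available as plain integers. The function names and the `max_distance` threshold are assumptions for illustration; the right threshold depends on the hash that is actually used.

```python
def hamming_distance(h1, h2):
    """Number of bit positions in which two integer hashes differ."""
    return bin(h1 ^ h2).count("1")

def images_match(h1, h2, max_distance=5):
    # max_distance is an assumed threshold, not a value from the thread.
    return hamming_distance(h1, h2) <= max_distance

# Made-up 16-bit hashes, for illustration only.
hash_a = 0b1011001011110000
hash_b = 0b1011001011110011  # differs from hash_a in the 2 lowest bits
hash_c = 0b0100110100001111  # bitwise complement of hash_a

print(hamming_distance(hash_a, hash_b), images_match(hash_a, hash_b))
print(hamming_distance(hash_a, hash_c), images_match(hash_a, hash_c))
```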




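  1. Change scores to an (N, N) array instead of (N).
  2. Add an inner loop, for j in range(N): (xrange in Python 2), creating a double loop.
  3. Add if i == j: break inside the inner loop, so that an object is never compared with itself and no pair is compared twice.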




import numpy as np 

class Thing:

    def __init__(self, words, images, audios, videos):
        self.words = words
        self.images = images
        self.audios = audios
        self.videos = videos

    def compare(self, other):
        score = 0
        # Assuming the attribute lists have the same length for both objects
        # and that they are sorted in the same manner:
        for i in range(len(self.words)):
            if self.words[i] == other.words[i]:
                score += 1
        for i in range(len(self.images)):
            if self.images[i] == other.images[i]:
                score += 1
        for i in range(len(self.audios)):
            if self.audios[i] == other.audios[i]:
                score += 1
        for i in range(len(self.videos)):
            if self.videos[i] == other.videos[i]:
                score += 1
        # You have to make sure you know what method to use for determining
        # when an image/audio/video are equal.
        return score

N = 1000 
things = [] 
words = np.random.randint(5, size=(N,5)) 
images = np.random.randint(5, size=(N,5)) 
audios = np.random.randint(5, size=(N,5)) 
videos = np.random.randint(5, size=(N,5)) 
# For testing purposes I assign each attribute to a list (array) containing 
# five random integers. I don't know how you actually intend to do it. 
for i in range(N):
    things.append(Thing(words[i], images[i], audios[i], videos[i]))

############################# This is the new part: ############################
scores = np.zeros((N, N))
# Scores will become a triangular matrix where scores[i, j]=value means that
# value is the number of attributes thing[i] and thing[j] have in common.
for i in range(N):
    for j in range(N):
        if i == j:
            # Break the loop here because:
            # * When i==j we would compare thing[i] with itself, and we don't
            #   want that.
            # * For every combination where j>i we would repeat all the
            #   comparisons for j<i and create duplicates. We don't want that.
            break
        scores[i, j] = things[i].compare(things[j])

# I want the 5 most similar pairs:
n = 5
# This list will contain a tuple for each of the n most similar pairs:
best_list = []
for k in range(n):
    ij = np.argmax(scores)  # Returns a single flat index: ij = i*N + j
    i = ij // N
    j = ij % N
    best_list.append((i, j))
    # Erase this score so that on the next iteration the second largest score
    # is found:
    scores[i, j] = 0

for k, (i, j) in enumerate(best_list):
    # The number 1 most similar pair is the BEST match of all.
    # The number n most similar pair is the weakest match in this list.
    print("The number %d most similar pair is thing number %d and %d."
          % (k+1, i, j))
    print("Thing%4d: %s %s %s %s"
          % (i, things[i].words, things[i].images,
             things[i].audios, things[i].videos))
    print("Thing%4d: %s %s %s %s"
          % (j, things[j].words, things[j].images,
             things[j].audios, things[j].videos))
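
As a possible alternative to the repeated argmax loop, the ranking can be handed to NumPy in one pass, which also leaves the scores array unmodified. This is only a sketch: `top_pairs` is a hypothetical helper, and it assumes scores is filled below the diagonal (j < i) as in the code above.

```python
import numpy as np

def top_pairs(scores, n):
    """Return the n (i, j) pairs with the highest scores, best first."""
    # Only look at entries strictly below the diagonal (j < i).
    rows, cols = np.tril_indices(scores.shape[0], k=-1)
    values = scores[rows, cols]
    # Stable sort on the negated values gives descending order;
    # tied scores keep their original index order.
    order = np.argsort(-values, kind="stable")[:n]
    return [(int(rows[k]), int(cols[k])) for k in order]

# Small made-up score matrix, for illustration only.
demo = np.array([[0, 0, 0, 0],
                 [3, 0, 0, 0],
                 [1, 7, 0, 0],
                 [5, 2, 4, 0]])
print(top_pairs(demo, 3))
```

Sorting every pair costs more than five argmax passes when only a handful of results are needed, so this mainly pays off when n is large or the scores must be kept intact.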

If this answer is what you had in mind, I can modify it to find the 5 closest pairs of objects. – PaulMag 2014-10-10 13:43:45


Thanks! That was a big help. – schoon 2014-10-13 09:15:41


@schoon No problem. Is this enough for you, or should I expand it to fully answer the second question? – PaulMag 2014-10-13 13:24:30



  1. Put all the objects into an array.
  2. Compute all the sums.
  3. Sort the array by sum.
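
One reading of the steps above, sketched under assumptions: each object is a plain sequence, the "sum" for a pair is the number of positions where the two sequences agree, and the array being sorted holds one entry per pair. The helper `common_count` and the sample data are hypothetical.

```python
from itertools import combinations

def common_count(a, b):
    """Stand-in score: number of positions where two sequences agree."""
    return sum(x == y for x, y in zip(a, b))

# Step 1: all objects in one array (made-up data).
objects = [
    [1, 2, 3, 4],
    [1, 2, 0, 4],
    [9, 9, 3, 9],
    [1, 2, 3, 0],
]

# Step 2: one (sum, i, j) entry per unordered pair of objects.
pairs = [(common_count(objects[i], objects[j]), i, j)
         for i, j in combinations(range(len(objects)), 2)]

# Step 3: sort by sum, highest first.
pairs.sort(key=lambda entry: entry[0], reverse=True)
print(pairs[:2])
```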
