2016-07-06 51 views

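Merging duplicate indices in a sparse tensor

Say I have a sparse tensor with duplicate indices, and wherever indices are duplicated I want to merge their values (sum them up). What is the best way to do this?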

For example:

indices = [[1, 1], [1, 2], [1, 2], [1, 3]]
values = [1, 2, 3, 4]

sparse = tf.SparseTensor(indices, values, dense_shape=[10, 10])

result = tf.MAGIC(sparse)  # placeholder for the operation being asked about

The result should be a sparse tensor (or a dense one!) with the following values:

indices = [[1, 1], [1, 2], [1, 3]]
values = [1, 5, 4]

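The only thing I have thought of is to string-concatenate the indices of each entry into an index hash, append that hash as a third dimension, and then reduce-sum over that third dimension.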

indices = [[1, 1, 11], [1, 2, 12], [1, 2, 12], [1, 3, 13]]
sparse_result = tf.sparse_reduce_sum(sparseTensor, reduction_axes=2, keep_dims=True)

But this feels very, very ugly.

Answer


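Here is a solution using tf.segment_sum. The idea is to linearize the indices into 1-D space, get the unique indices with tf.unique, run tf.segment_sum, and convert the indices back to N-D space.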

indices = tf.constant([[1, 1], [1, 2], [1, 2], [1, 3]]) 
values = tf.constant([1, 2, 3, 4]) 

# Linearize the indices. If the dimensions of original array are 
# [N_{k}, N_{k-1}, ... N_0], then simply matrix multiply the indices 
# by [..., N_1 * N_0, N_0, 1]^T. For example, if the sparse tensor 
# has dimensions [10, 6, 4, 5], then multiply by [120, 20, 5, 1]^T 
# In your case, the dimensions are [10, 10], so multiply by [10, 1]^T 

linearized = tf.matmul(indices, [[10], [1]]) 

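# Get the unique indices, and the position of each original index
# among them
y, idx = tf.unique(tf.squeeze(linearized, axis=1))

# Use those positions as segment ids to sum the values that share an
# index. tf.segment_sum needs sorted segment ids, which holds here
# because the duplicates are adjacent; otherwise use
# tf.unsorted_segment_sum(values, idx, tf.size(y))
values = tf.segment_sum(values, idx)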

# Go back to N-D indices 
y = tf.expand_dims(y, 1) 
indices = tf.concat([y//10, y%10], axis=1) 

tf.InteractiveSession() 
print(indices.eval()) 
print(values.eval()) 
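
To make the four steps easier to follow for an arbitrary shape, here is a minimal plain-Python sketch of the same idea with no TensorFlow involved. The helper names `row_major_strides` and `merge_duplicates` are made up for this illustration and are not part of any library; the sketch also assumes every index lies within the given shape.

```python
def row_major_strides(shape):
    """Multipliers that flatten an N-D index, e.g. [10, 6, 4, 5] -> [120, 20, 5, 1]."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def merge_duplicates(indices, values, shape):
    """Sum the values of entries that share the same N-D index (hypothetical helper)."""
    strides = row_major_strides(shape)

    # Step 1: linearize each N-D index into a single integer.
    flat = [sum(i * s for i, s in zip(idx, strides)) for idx in indices]

    # Step 2: unique flat indices in order of first appearance, plus the
    # position of every input entry among them (the role tf.unique plays).
    position = {}
    segment_ids = [position.setdefault(f, len(position)) for f in flat]

    # Step 3: sum the values per segment; the ids need not be sorted here.
    sums = [0] * len(position)
    for seg, v in zip(segment_ids, values):
        sums[seg] += v

    # Step 4: convert each unique flat index back to an N-D index.
    merged_indices = []
    for f in position:  # dicts keep insertion order in Python 3.7+
        rem, idx = f, []
        for s in strides:
            idx.append(rem // s)
            rem %= s
        merged_indices.append(idx)
    return merged_indices, sums


merged_indices, merged_values = merge_duplicates(
    [[1, 1], [1, 2], [1, 2], [1, 3]], [1, 2, 3, 4], [10, 10])
print(merged_indices)  # [[1, 1], [1, 2], [1, 3]]
print(merged_values)   # [1, 5, 4]
```

Because step 3 accumulates into a list by position, this sketch also handles duplicates that are not adjacent in the input, which is the case where the TensorFlow version would need tf.unsorted_segment_sum.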

This is much prettier than I imagined – dtracers