2017-01-10 174 views

Split an image tensor into small patches

I have an image of shape (466, 394, 1) that I want to split into 7x7 patches.

image = tf.placeholder(dtype=tf.float32, shape=[1, 466, 394, 1]) 

Using

image_patches = tf.extract_image_patches(image, [1, 7, 7, 1], [1, 7, 7, 1], [1, 1, 1, 1], 'VALID') 
# shape (1, 66, 56, 49) 

image_patches_reshaped = tf.reshape(image_patches, [-1, 7, 7, 1]) 
# shape (3696, 7, 7, 1) 

Unfortunately, this does not work in practice, because image_patches_reshaped seems to scramble the pixel order (if you view image_patches_reshaped you only see noise).
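The shapes in the comments above follow from integer division. Here is a small arithmetic sketch in plain Python (the variable names are illustrative, not from the original code), assuming that 'VALID' padding keeps only patches that lie completely inside the image:

```python
# Patch-count arithmetic for non-overlapping 7x7 patches with 'VALID' padding.
height, width, patch = 466, 394, 7

# Number of complete patches along each axis.
rows, cols = height // patch, width // patch
num_patches = rows * cols

# Pixels at the bottom/right edge that do not fill a complete patch.
leftover_rows = height - rows * patch
leftover_cols = width - cols * patch

print(rows, cols, num_patches)       # 66 56 3696
print(leftover_rows, leftover_cols)  # 4 2
```

So the last 4 rows and 2 columns of the image do not end up in any patch with this call.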

So my new approach was to use tf.split:

image_hsplits = tf.split(1, 4, image_resized) 
# [<tf.Tensor 'split_255:0' shape=(462, 7, 1) dtype=float32>,...] 

image_patches = [] 

for split in image_hsplits: 
    image_patches.extend(tf.split(0, 66, split)) 

image_patches 
# [<tf.Tensor 'split_317:0' shape=(7, 7, 1) dtype=float32>, ...] 

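This does preserve the pixel order, but unfortunately it creates a lot of ops, which is not very nice.

How can I split an image into smaller patches with fewer ops?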

UPDATE1:

I ported the answer of this question for numpy to TensorFlow:

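import math

def image_to_patches(image, image_height, image_width, patch_height, patch_width):
    # Round each dimension up to the next multiple of the patch size.
    # float() and int() keep the result an exact integer on Python 2 and 3.
    height = int(math.ceil(image_height / float(patch_height))) * patch_height
    width = int(math.ceil(image_width / float(patch_width))) * patch_width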

    image_resized = tf.squeeze(tf.image.resize_image_with_crop_or_pad(image, height, width)) 
    image_reshaped = tf.reshape(image_resized, [height // patch_height, patch_height, -1, patch_width]) 
    image_transposed = tf.transpose(image_reshaped, [0, 2, 1, 3]) 
    return tf.reshape(image_transposed, [-1, patch_height, patch_width, 1]) 

However, I think there is still room for improvement.
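The reshape/transpose trick can be checked outside a TensorFlow graph. This is a minimal NumPy sketch, not the code above: the name `image_to_patches_np` is hypothetical, and it assumes the image dimensions are already exact multiples of the patch size, so there is no padding step.

```python
import numpy as np


def image_to_patches_np(image, patch_height, patch_width):
    """Split a 2-D array into non-overlapping patches (hypothetical helper).

    Assumes image.shape is an exact multiple of the patch size.
    """
    height, width = image.shape
    rows, cols = height // patch_height, width // patch_width
    # Split each axis into (block index, offset inside block).
    blocks = image.reshape(rows, patch_height, cols, patch_width)
    # Move both block indices to the front: (rows, cols, ph, pw).
    blocks = blocks.transpose(0, 2, 1, 3)
    # Flatten the block grid into a batch of single-channel patches.
    return blocks.reshape(-1, patch_height, patch_width, 1)


image = np.arange(14 * 14).reshape(14, 14)
patches = image_to_patches_np(image, 7, 7)
print(patches.shape)
```

Patches come out in row-major grid order, so `patches[1]` is the block to the right of the top-left one.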

UPDATE2:

This converts the patches back to the original image.

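import math

def patches_to_image(patches, image_height, image_width, patch_height, patch_width):
    # Recompute the padded size that was used when the patches were created.
    # float() and int() keep the result an exact integer on Python 2 and 3.
    height = int(math.ceil(image_height / float(patch_height))) * patch_height
    width = int(math.ceil(image_width / float(patch_width))) * patch_width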

    image_reshaped = tf.reshape(tf.squeeze(patches), [height // patch_height, width // patch_width, patch_height, patch_width]) 
    image_transposed = tf.transpose(image_reshaped, [0, 2, 1, 3]) 
    image_resized = tf.reshape(image_transposed, [height, width, 1]) 
    return tf.image.resize_image_with_crop_or_pad(image_resized, image_height, image_width) 
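A quick way to gain confidence in the inverse is a round trip on a small array. This NumPy sketch uses a hypothetical helper, `patches_to_image_np`, and assumes that the patches are in row-major grid order and that the image size is an exact multiple of the patch size (no crop or pad step):

```python
import numpy as np


def patches_to_image_np(patches, height, width):
    """Reassemble (N, ph, pw, 1) patches into a (height, width) array.

    Hypothetical helper; assumes row-major patch order and that height
    and width are exact multiples of the patch size.
    """
    _, patch_height, patch_width, _ = patches.shape
    rows, cols = height // patch_height, width // patch_width
    # Restore the block grid: (rows, cols, ph, pw).
    grid = patches.reshape(rows, cols, patch_height, patch_width)
    # Interleave block index and in-block offset again: (rows, ph, cols, pw).
    grid = grid.transpose(0, 2, 1, 3)
    return grid.reshape(height, width)


# Build patches with the forward reshape/transpose, then invert them.
image = np.arange(14 * 21).reshape(14, 21)
patches = (image.reshape(2, 7, 3, 7)
           .transpose(0, 2, 1, 3)
           .reshape(-1, 7, 7, 1))
restored = patches_to_image_np(patches, 14, 21)
print(np.array_equal(restored, image))
```

A non-square test image is used on purpose, since swapping the height and width axes would go unnoticed on a square one.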

Answers


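I think your problem is somewhere else. I wrote the following code snippet (using a smaller 14x14 image so that I could check all the values by hand) and confirmed that your initial code does the correct operation: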

import tensorflow as tf 
import numpy as np 

IMAGE_SIZE = [1, 14, 14, 1] 
PATCH_SIZE = [1, 7, 7, 1] 

input_image = np.reshape(np.arange(14 * 14), IMAGE_SIZE)
image = tf.placeholder(dtype=tf.int32, shape=IMAGE_SIZE) 
image_patches = tf.extract_image_patches(
    image, PATCH_SIZE, PATCH_SIZE, [1, 1, 1, 1], 'VALID') 
image_patches_reshaped = tf.reshape(image_patches, [-1, 7, 7, 1]) 

sess = tf.Session() 

(output, output_reshaped) = sess.run(
    (image_patches, image_patches_reshaped), 
    feed_dict={image: input_image}) 

print("Output (shape: %s):" % (output.shape,))
print(output)

print("Reshaped (shape: %s):" % (output_reshaped.shape,))
print(output_reshaped)

The output is:

python resize.py 
Output (shape: (1, 2, 2, 49)): 
[[[[ 0 1 2 3 4 5 6 14 15 16 17 18 19 20 28 29 30 31 
    32 33 34 42 43 44 45 46 47 48 56 57 58 59 60 61 62 70 
    71 72 73 74 75 76 84 85 86 87 88 89 90] 
    [ 7 8 9 10 11 12 13 21 22 23 24 25 26 27 35 36 37 38 
    39 40 41 49 50 51 52 53 54 55 63 64 65 66 67 68 69 77 
    78 79 80 81 82 83 91 92 93 94 95 96 97]] 

    [[ 98 99 100 101 102 103 104 112 113 114 115 116 117 118 126 127 128 129 
    130 131 132 140 141 142 143 144 145 146 154 155 156 157 158 159 160 168 
    169 170 171 172 173 174 182 183 184 185 186 187 188] 
    [105 106 107 108 109 110 111 119 120 121 122 123 124 125 133 134 135 136 
    137 138 139 147 148 149 150 151 152 153 161 162 163 164 165 166 167 175 
    176 177 178 179 180 181 189 190 191 192 193 194 195]]]] 
Reshaped (shape: (4, 7, 7, 1)): 
[[[[ 0] 
    [ 1] 
    [ 2] 
    [ 3] 
    [ 4] 
    [ 5] 
    [ 6]] 

    [[ 14] 
    [ 15] 
    [ 16] 
    [ 17] 
    [ 18] 
    [ 19] 
    [ 20]] 

    [[ 28] 
    [ 29] 
    [ 30] 
    [ 31] 
    [ 32] 
    [ 33] 
    [ 34]] 

    [[ 42] 
    [ 43] 
    [ 44] 
    [ 45] 
    [ 46] 
    [ 47] 
    [ 48]] 

    [[ 56] 
    [ 57] 
    [ 58] 
    [ 59] 
    [ 60] 
    [ 61] 
    [ 62]] 

    [[ 70] 
    [ 71] 
    [ 72] 
    [ 73] 
    [ 74] 
    [ 75] 
    [ 76]] 

    [[ 84] 
    [ 85] 
    [ 86] 
    [ 87] 
    [ 88] 
    [ 89] 
    [ 90]]] 


[[[ 7] 
    [ 8] 
    [ 9] 
    [ 10] 
    [ 11] 
    [ 12] 
    [ 13]] 

    [[ 21] 
    [ 22] 
    [ 23] 
    [ 24] 
    [ 25] 
    [ 26] 
    [ 27]] 

    [[ 35] 
    [ 36] 
    [ 37] 
    [ 38] 
    [ 39] 
    [ 40] 
    [ 41]] 

    [[ 49] 
    [ 50] 
    [ 51] 
    [ 52] 
    [ 53] 
    [ 54] 
    [ 55]] 

    [[ 63] 
    [ 64] 
    [ 65] 
    [ 66] 
    [ 67] 
    [ 68] 
    [ 69]] 

    [[ 77] 
    [ 78] 
    [ 79] 
    [ 80] 
    [ 81] 
    [ 82] 
    [ 83]] 

    [[ 91] 
    [ 92] 
    [ 93] 
    [ 94] 
    [ 95] 
    [ 96] 
    [ 97]]] 


[[[ 98] 
    [ 99] 
    [100] 
    [101] 
    [102] 
    [103] 
    [104]] 

    [[112] 
    [113] 
    [114] 
    [115] 
    [116] 
    [117] 
    [118]] 

    [[126] 
    [127] 
    [128] 
    [129] 
    [130] 
    [131] 
    [132]] 

    [[140] 
    [141] 
    [142] 
    [143] 
    [144] 
    [145] 
    [146]] 

    [[154] 
    [155] 
    [156] 
    [157] 
    [158] 
    [159] 
    [160]] 

    [[168] 
    [169] 
    [170] 
    [171] 
    [172] 
    [173] 
    [174]] 

    [[182] 
    [183] 
    [184] 
    [185] 
    [186] 
    [187] 
    [188]]] 


[[[105] 
    [106] 
    [107] 
    [108] 
    [109] 
    [110] 
    [111]] 

    [[119] 
    [120] 
    [121] 
    [122] 
    [123] 
    [124] 
    [125]] 

    [[133] 
    [134] 
    [135] 
    [136] 
    [137] 
    [138] 
    [139]] 

    [[147] 
    [148] 
    [149] 
    [150] 
    [151] 
    [152] 
    [153]] 

    [[161] 
    [162] 
    [163] 
    [164] 
    [165] 
    [166] 
    [167]] 

    [[175] 
    [176] 
    [177] 
    [178] 
    [179] 
    [180] 
    [181]] 

    [[189] 
    [190] 
    [191] 
    [192] 
    [193] 
    [194] 
    [195]]]] 

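Looking at the reshaped output, you can see that it is 4x7x7x1, with the values of the first patch being [0, 7), [14, 21), [28, 35), [42, 49), [56, 63), [70, 77) and [84, 91), which corresponds to the top-left 7x7 grid.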
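The same layout can be reproduced without TensorFlow. This is a minimal NumPy sketch that emulates the (1, 2, 2, 49) output by hand rather than calling tf.extract_image_patches, assuming (as the printed output shows) that each depth-49 vector is one patch flattened row by row:

```python
import numpy as np

image = np.arange(14 * 14).reshape(14, 14)

# Emulate the (1, 2, 2, 49) result: one row-major flattened 7x7 patch
# per grid position (an assumption taken from the printed output).
flat = np.stack([
    np.stack([image[r:r + 7, c:c + 7].reshape(49) for c in (0, 7)])
    for r in (0, 7)
])[np.newaxis]

# The reshape in question: every 49-vector becomes a 7x7x1 patch.
reshaped = flat.reshape(-1, 7, 7, 1)

print(flat.shape, reshaped.shape)
print(reshaped[0, :, :, 0])
```

Because the reshape only regroups consecutive elements, it leaves each patch intact as long as the depth axis really holds row-major patches.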

Perhaps you can explain what happens when it does not work properly?

+0

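That's an interesting point! I will look into it later. In my model I assumed the corruption had to happen in the reshape step, because the patches did not look right when I inspected them, but I cannot guarantee it. – bodokaiser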

+0

Hey, you are right. I just tested it with the raw data. I don't know what went wrong before. – bodokaiser