2015-01-14 14 views
3

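Implementing LeCun local contrast normalization with Theano

I am trying to implement LeCun's local contrast normalization using code I found, but I am getting incorrect results. The code is written in Python and uses the Theano library.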

def lecun_lcn(input, img_shape, kernel_shape, threshold=1e-4): 
    """ 
    Yann LeCun's local contrast normalization 
    Original code in Theano by: Guillaume Desjardins
    """ 
    input = input.reshape(input.shape[0], 1, img_shape[0], img_shape[1]) 
    X = T.matrix(dtype=theano.config.floatX) 
    X = X.reshape(input.shape) 

    filter_shape = (1, 1, kernel_shape, kernel_shape) 
    filters = gaussian_filter(kernel_shape).reshape(filter_shape) 

    convout = conv.conv2d(input=X, 
          filters=filters, 
          image_shape=(input.shape[0], 1, img_shape[0], img_shape[1]), 
          filter_shape=filter_shape, 
          border_mode='full') 

    # For each pixel, remove mean of 9x9 neighborhood 

    mid = int(np.floor(kernel_shape/2.)) 
    centered_X = X - convout[:, :, mid:-mid, mid:-mid] 
    # Scale down norm of 9x9 patch if norm is bigger than 1 
    sum_sqr_XX = conv.conv2d(input=centered_X ** 2, 
          filters=filters, 
          image_shape=(input.shape[0], 1, img_shape[0], img_shape[1]), 
          filter_shape=filter_shape, 
          border_mode='full') 

    denom = T.sqrt(sum_sqr_XX[:, :, mid:-mid, mid:-mid]) 
    per_img_mean = denom.mean(axis=[1, 2]) 
    divisor = T.largest(per_img_mean.dimshuffle(0, 'x', 'x', 1), denom) 
    divisor = T.maximum(divisor, threshold) 

    new_X = centered_X/divisor 
    new_X = new_X.dimshuffle(0, 2, 3, 1) 
    new_X = new_X.flatten(ndim=3) 

    f = theano.function([X], new_X) 
    return f(input) 

Here is the test code:

x_img_origin = plt.imread("..//data//Lenna.png") 
x_img = plt.imread("..//data//Lenna.png") 
x_img_real_result = plt.imread("..//data//Lenna_Processed.png") 

x_img = x_img.reshape(1, x_img.shape[0], x_img.shape[1], x_img.shape[2]) 
for d in range(3): 
    x_img[:, :, :, d] = tools.lecun_lcn(x_img[:, :, :, d], (x_img.shape[1], x_img.shape[2]), 9) 
x_img = x_img[0] 

pylab.subplot(1, 3, 1); pylab.axis('off'); pylab.imshow(x_img_origin) 
pylab.gray() 
pylab.subplot(1, 3, 2); pylab.axis('off'); pylab.imshow(x_img) 
pylab.subplot(1, 3, 3); pylab.axis('off'); pylab.imshow(x_img_real_result) 
pylab.show() 

Here is the result:

Example image processing results

(From left to right: original, my result, expected result)

Can someone tell me what I did wrong in the code?

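I did something similar once, but unfortunately I cannot run your code because it is not self-contained. Could you make it runnable so that it can be debugged? Among other things, you need to specify which Gaussian filter function you are using. – eickenberg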

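Hi eickenberg, here is the code; please change the image paths as needed: http://pastebin.com/x6WREp7D Here is the image: http://upload.wikimedia.org/wikipedia/en/2/24/Lenna.png Let me know if you need anything else. I think the threshold is the culprit: if I increase the threshold, the output gets closer to the expected result. –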

Answers

1

I think these two lines may use the wrong tensor axes:

per_img_mean = denom.mean(axis=[1, 2]) 
divisor = T.largest(per_img_mean.dimshuffle(0, 'x', 'x', 1), denom) 

They should be rewritten as:

per_img_mean = denom.mean(axis=[2, 3]) 
divisor = T.largest(per_img_mean.dimshuffle(0, 1, 'x', 'x'), denom) 
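
The effect of the axis choice can be checked without Theano. The following is a minimal NumPy sketch, not the original code: the small `denom` array is a made-up stand-in for the symbolic tensor, laid out as (batch, channel, height, width), and `None` indexing plays the role of `dimshuffle` with `'x'`.

```python
import numpy as np

# Hypothetical stand-in for `denom`: 2 images, 1 channel, 4x5 pixels.
denom = np.arange(2 * 1 * 4 * 5, dtype=np.float64).reshape(2, 1, 4, 5)

# Original axes (1, 2): averages over channel and height,
# which leaves one value per image *column*, not per image.
wrong = denom.mean(axis=(1, 2))       # shape (2, 5)
wrong_b = wrong[:, None, None, :]     # like dimshuffle(0, 'x', 'x', 1)

# Corrected axes (2, 3): averages over height and width,
# which leaves one value per image and channel.
right = denom.mean(axis=(2, 3))       # shape (2, 1)
right_b = right[:, :, None, None]     # like dimshuffle(0, 1, 'x', 'x')

# Element-wise maximum of the per-image mean and the local value.
divisor = np.maximum(right_b, denom)

print(wrong.shape, right.shape, divisor.shape)
```

With the original axes the divisor varies from column to column, which would plausibly produce a visibly wrong normalization; with the corrected axes each image gets a single mean value.
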
6

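Here is how I implemented local contrast normalization as described in Jarrett et al. (http://yann.lecun.com/exdb/publis/pdf/jarrett-iccv-09.pdf). You can use it as a separate layer.

I tested it with the code from Theano's LeNet tutorial, where I applied LCN to the input and to each convolutional layer, which gave slightly better results.

You can find the full code here: https://github.com/jostosh/theano_utils/blob/master/lcn.py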

import numpy
import theano
import theano.tensor as T
from theano.tensor.nnet import conv


class LecunLCN(object):
    def __init__(self, X, image_shape, threshold=1e-4, radius=9, use_divisor=True):
        """
        Allocate an LCN.

        :type X: theano.tensor.dtensor4
        :param X: symbolic image tensor, of shape image_shape

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type threshold: double
        :param threshold: the threshold will be used to avoid division by zeros

        :type radius: int
        :param radius: determines size of Gaussian filter patch (default 9x9)

        :type use_divisor: Boolean
        :param use_divisor: whether or not to apply divisive normalization
        """

        # Get Gaussian filter
        filter_shape = (1, image_shape[1], radius, radius)

        self.filters = theano.shared(self.gaussian_filter(filter_shape), borrow=True)

        # Compute the Gaussian weighted average by means of convolution
        convout = conv.conv2d(
            input=X,
            filters=self.filters,
            image_shape=image_shape,
            filter_shape=filter_shape,
            border_mode='full'
        )

        # Subtractive step: crop the 'full' output back to the input size
        mid = int(numpy.floor(filter_shape[2] / 2.))

        # Make filter dimension broadcastable and subtract
        centered_X = X - T.addbroadcast(convout[:, :, mid:-mid, mid:-mid], 1)

        # Boolean marks whether or not to perform divisive step
        if use_divisor:
            # Note that the local variances can be computed by using the centered_X
            # tensor. If we convolve this with the mean filter, that should give us
            # the variance at each point. We simply take the square root to get our
            # denominator

            # Compute variances
            sum_sqr_XX = conv.conv2d(
                input=T.sqr(centered_X),
                filters=self.filters,
                image_shape=image_shape,
                filter_shape=filter_shape,
                border_mode='full'
            )

            # Take square root to get local standard deviation
            denom = T.sqrt(sum_sqr_XX[:, :, mid:-mid, mid:-mid])

            per_img_mean = denom.mean(axis=[2, 3])
            divisor = T.largest(per_img_mean.dimshuffle(0, 1, 'x', 'x'), denom)
            # Divisive step
            new_X = centered_X / T.maximum(T.addbroadcast(divisor, 1), threshold)
        else:
            new_X = centered_X

        self.output = new_X

    def gaussian_filter(self, kernel_shape):
        x = numpy.zeros(kernel_shape, dtype=theano.config.floatX)

        def gauss(x, y, sigma=2.0):
            Z = 2 * numpy.pi * sigma ** 2
            return 1. / Z * numpy.exp(-(x ** 2 + y ** 2) / (2. * sigma ** 2))

        mid = numpy.floor(kernel_shape[-1] / 2.)
        for kernel_idx in range(0, kernel_shape[1]):
            for i in range(0, kernel_shape[2]):
                for j in range(0, kernel_shape[3]):
                    x[0, kernel_idx, i, j] = gauss(i - mid, j - mid)

        # Normalize so that all filter weights sum to 1
        return x / numpy.sum(x)
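
To sanity-check the output of a Theano implementation, it can help to compare against a plain NumPy version. The following is a rough single-channel sketch of the same steps (Gaussian-weighted mean subtraction, then division by the larger of the local standard deviation and its image-wide mean). The names `gaussian_kernel`, `filter_same` and `lcn_numpy` are made up for this illustration and are not part of the linked repository. It assumes an odd kernel size, for which zero-padded same-size filtering matches the `border_mode='full'` convolution cropped with `mid:-mid`.

```python
import numpy as np


def gaussian_kernel(size=9, sigma=2.0):
    """Normalized 2D Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax, indexing="ij")
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def filter_same(img, kernel):
    """Zero-padded filtering; the output has the same shape as img.

    The Gaussian kernel is symmetric, so correlation and convolution agree.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="constant")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def lcn_numpy(img, size=9, sigma=2.0, threshold=1e-4):
    """Local contrast normalization of one 2D image (hypothetical helper)."""
    img = np.asarray(img, dtype=np.float64)
    kernel = gaussian_kernel(size, sigma)
    # Subtractive step: remove the Gaussian-weighted local mean.
    centered = img - filter_same(img, kernel)
    # Divisive step: local standard deviation, floored by its image-wide mean.
    local_std = np.sqrt(filter_same(centered ** 2, kernel))
    divisor = np.maximum(local_std.mean(), local_std)
    return centered / np.maximum(divisor, threshold)


rng = np.random.RandomState(0)
img = rng.rand(32, 32)
out = lcn_numpy(img)
print(out.shape)
```

For an RGB image such as the one in the question, one option is to call `lcn_numpy` on each channel separately, which mirrors the per-channel loop in the test code.
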

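It would be really great if you also added an example of how to call the function. I am new to these methods; the result I get from calling LecunLCN is 'Elemwise{true_div,no_inplace}.0', and I do not know whether that is correct or how to extract the image from it. –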

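For those who are not familiar with Theano: the output of the function is a symbolic object that has not yet been compiled or evaluated. So if you want to extract the normalized image, you have to write 'LecunLCN(X).output.eval()[0][0]' –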