Custom dropout in Tensorflow

2017-01-11

I am training a DNN model on some data, and I want to analyze the learned weights in order to understand the real system I am studying (a signaling cascade in biology). I suppose you could say that I am using an artificial neural network to learn about a biological neural network.

In each of my training examples, a single gene that is responsible for signaling in the top layer has been deleted.

Since I am modeling this signaling cascade with a neural network, and deleting that gene amounts to removing one of the nodes in the first hidden layer, I realized that I am effectively doing a real-life version of dropout.

I would therefore like to train my model with dropout, but every dropout implementation I have seen online seems to drop nodes out at random. What I need is a way to specify, for each training example, which node to drop out.

Any advice on how to implement this? I am open to any package, but everything I have done so far is in Tensorflow, so I would appreciate a solution that uses that framework.

For those who prefer the details spelled out:

I have 10 input variables that are fully connected to 32 ReLU nodes in the first layer, which are fully connected to a second layer (ReLU), which in turn is fully connected to the output (linear, since I am doing regression).

In addition to the 10 input variables, I also happen to know which of the 32 first-layer nodes should be dropped out.

Is there a way to specify this at training time?

Here is the code I am currently using:

import tflearn

num_stresses = 10 
num_kinase = 32 
num_transcription_factors = 200 
num_genes = 6692 

# Build neural network 
# Input variables (10) 
# Which Node to dropout (32) 
stress = tflearn.input_data(shape=[None, num_stresses]) 
kinase_deletion = tflearn.input_data(shape=[None, num_kinase]) 

# This is the layer that I want to perform selective dropout on, 
# I should be able to specify which of the 32 nodes should output zero 
# based on a 1X32 vector of ones and zeros. 
kinase = tflearn.fully_connected(stress, num_kinase, activation='relu') 

transcription_factor = tflearn.fully_connected(kinase, num_transcription_factors, activation='relu') 

gene = tflearn.fully_connected(transcription_factor, num_genes, activation='linear') 

adam = tflearn.Adam(learning_rate=0.00001, beta1=0.99) 

regression = tflearn.regression(gene, optimizer=adam, loss='mean_square', metric='R2') 

# Define model 
model = tflearn.DNN(regression, tensorboard_verbose=1) 

Answers


I would provide, along with your input variables, a vector of the same size that is all 1s, except for a 0 at the position of the node you want to delete.

The very first operation should then be a multiplication that zeroes out the gene you want to delete; from there on, everything should work just as it does now. You can either do that multiplication (zeroing out your gene) before handing the data to tensorflow, or add another placeholder and feed the mask into the graph through feed_dict, just like you do with your other variables. The latter would probably be better.

If you need to drop a hidden node (in layer 2), it is just another vector of 1s and 0s.

Let me know if that works, or if you need more help.
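
To make that concrete, here is a minimal sketch in plain tensorflow (which is what I normally use); the layer sizes follow the question, but the names (x, hidden_mask, W, b) and the example batch are illustrative assumptions, not part of the original code:

import numpy as np 
import tensorflow as tf 

num_inputs = 10 
num_hidden = 32 

x = tf.placeholder(tf.float32, [None, num_inputs]) 
hidden_mask = tf.placeholder(tf.float32, [None, num_hidden]) # 1 = keep the node, 0 = drop it 

W = tf.Variable(tf.truncated_normal([num_inputs, num_hidden], stddev=0.1)) 
b = tf.Variable(tf.zeros([num_hidden])) 

hidden = tf.nn.relu(tf.matmul(x, W) + b) 
hidden_dropped = tf.multiply(hidden, hidden_mask) # zeroes the chosen node, per example 

# One mask row per training example, e.g. for a batch of 4: 
nodes_to_drop = np.array([3, 7, 0, 12]) # which hidden node to silence in each example 
mask = np.ones((4, num_hidden), dtype=np.float32) 
mask[np.arange(4), nodes_to_drop] = 0.0 
# then: sess.run(train_op, feed_dict={x: batch_x, hidden_mask: mask, ...}) 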


Edit: OK, I have not actually worked with tflearn very much (I just use regular tensorflow), but I think you can combine tensorflow and tflearn. Basically, I added tf.multiply. You may need to add another tflearn.input_data(shape=[num_stresses]) and tflearn.input_data(shape=[num_kinase]) to give yourself placeholders for stresses_dropout_vector and kinase_dropout_vector. And of course, you can change the number and position of the zeros in those two vectors.

import tensorflow as tf ###### New ###### 
import tflearn 

num_stresses = 10 
num_kinase = 32 
num_transcription_factors = 200 
num_genes = 6692 

desired_node_to_drop = 2 # example index; pick the input you want silenced 
desired_hidden_node_to_drop = 5 # example index; pick the hidden node you want silenced 

stresses_dropout_vector = [1.0] * num_stresses ###### NEW ###### floats, so the dtype matches the float32 layer outputs 
stresses_dropout_vector[desired_node_to_drop] = 0.0 ###### NEW ###### 

kinase_dropout_vector = [1.0] * num_kinase ###### NEW ###### 
kinase_dropout_vector[desired_hidden_node_to_drop] = 0.0 ###### NEW ###### 

# Build neural network 
# Input variables (10) 
# Which Node to dropout (32) 
stress = tflearn.input_data(shape=[None, num_stresses]) 
kinase_deletion = tflearn.input_data(shape=[None, num_kinase]) 

# This is the layer that I want to perform selective dropout on, 
# I should be able to specify which of the 32 nodes should output zero 
# based on a 1X32 vector of ones and zeros. 

stress_dropout = tf.multiply(stress, stresses_dropout_vector) ###### NEW ###### Drops out an input 
kinase = tflearn.fully_connected(stress_dropout, num_kinase, activation='relu') ### changed stress to stress_dropout 
kinase_dropout = tf.multiply(kinase, kinase_dropout_vector) ###### NEW ###### Drops out a hidden node 

transcription_factor = tflearn.fully_connected(kinase_dropout, num_transcription_factors, activation='relu') ### changed kinase to kinase_dropout 

gene = tflearn.fully_connected(transcription_factor, num_genes, activation='linear') 

adam = tflearn.Adam(learning_rate=0.00001, beta1=0.99) 

regression = tflearn.regression(gene, optimizer=adam, loss='mean_square', metric='R2') 

# Define model 
model = tflearn.DNN(regression, tensorboard_verbose=1) 

If mixing tensorflow in like this does not work, you just need to find a plain old tflearn function that does element-wise multiplication of two given tensors/vectors.
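
For what it is worth, tflearn also ships a merge op that, as far as I remember, supports an element-wise multiply mode; treat this one-liner as an unverified sketch rather than tested code:

kinase_dropout = tflearn.merge([kinase, kinase_deletion], mode='elemwise_mul') 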

Hope that helps.


"If you need to drop a hidden node (in layer 2), it is just another vector of 1s and 0s." That is exactly what I am trying to do. So should I create a second input tensor after the second layer? – kmace


Yes, so you would have two placeholders, ph_inputDropout and ph_layer2Dropout. The output of layer 2 is something like input * weights + bias; just multiply that by the 1s and 0s in ph_layer2Dropout. –
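
In feed_dict terms that would look roughly like the sketch below; ph_inputDropout and ph_layer2Dropout follow the names above, while sess, train_op, x, and batch_x are assumed to exist elsewhere:

input_mask = np.ones((1, 10), dtype=np.float32) # all ones: keep every input 
layer2_mask = np.ones((1, 32), dtype=np.float32) 
layer2_mask[0, 5] = 0.0 # e.g. silence hidden node 5 
sess.run(train_op, feed_dict={x: batch_x, 
                              ph_inputDropout: input_mask, 
                              ph_layer2Dropout: layer2_mask}) 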


Oh, you posted your code. Let me see if I can edit it... –


For the sake of completeness, here is my final implementation:

import numpy as np 
import pandas as pd 
import tflearn 
import tensorflow as tf 

meta = pd.read_csv('../../input/nn/meta.csv') 
experiments = meta["Unnamed: 0"] 
del meta["Unnamed: 0"] 

stress_one_hot = pd.get_dummies(meta["train"]) 

kinase_deletion = pd.get_dummies(meta["Strain"]) # one-hot: 1 marks the deleted kinase 
kinase_one_hot = 1 - kinase_deletion # mask: 0 at the deleted kinase, 1 everywhere else 

expression = pd.read_csv('../../input/nn/data.csv') 
genes = expression["Unnamed: 0"] 
del expression["Unnamed: 0"] # This holds the gene names just so you know... 

expression = expression.transpose() 

# Set up data for tensorflow 
# Gene expression 
target = expression 
target = np.array(expression, dtype='float32') 
target_mean = target.mean(axis=0, keepdims=True) 
target_std = target.std(axis=0, keepdims=True) 
target = target - target_mean 
target = target/target_std 

# Stress information 
data1 = stress_one_hot 
data1 = np.array(data1, dtype='float32') 
data_mean = data1.mean(axis=0, keepdims=True) 
data_std = data1.std(axis=0, keepdims=True) 
data1 = data1 - data_mean 
data1 = data1/data_std 

# Kinase information 
data2 = kinase_one_hot 
data2 = np.array(data2, dtype='float32') 

# For Reference 
# data1.shape 
# #(301, 10) 
# data2.shape 
# #(301, 29) 


# Build the Neural Network 

num_stresses = 10 
num_kinase = 29 
num_transcription_factors = 200 
num_genes = 6692 

# Build neural network 
# Input variables (10) 
# Which Node to dropout (29) 
stress = tflearn.input_data(shape=[None, num_stresses]) 
kinase_deletion = tflearn.input_data(shape=[None, num_kinase]) 

# This is the layer that I want to perform selective dropout on, 
# I should be able to specify which of the 29 nodes should output zero 
# based on a 1X29 vector of ones and zeros. 
kinase = tflearn.fully_connected(stress, num_kinase, activation='relu') 
kinase_dropout = tf.mul(kinase, kinase_deletion) # note: tf.mul was renamed tf.multiply in TensorFlow 1.0 

transcription_factor = tflearn.fully_connected(kinase_dropout, num_transcription_factors, activation='relu') 

gene = tflearn.fully_connected(transcription_factor, num_genes, activation='linear') 

adam = tflearn.Adam(learning_rate=0.00001, beta1=0.99) 

regression = tflearn.regression(gene, optimizer=adam, loss='mean_square', metric='R2') 

# Define model 
model = tflearn.DNN(regression, tensorboard_verbose=1) 

# Start training (apply gradient descent algorithm) 
model.fit([data1, data2], target, n_epoch=20000, show_metric=True, shuffle=True)#,validation_set=0.05)
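
One closing note on usage: since the deletion mask is just another input, the trained network can be queried for the unperturbed system by feeding all ones. A small sketch under that assumption (no_deletion and predicted_expression are illustrative names, not part of the original script):

no_deletion = np.ones((data1.shape[0], num_kinase), dtype='float32') # all ones: no kinase deleted 
predicted_expression = model.predict([data1, no_deletion]) 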