TensorFlow DNN with two hidden layers is significantly slower and produces worse results than skflow

2016-06-21

My biological dataset has 20K rows and 170 features. I am doing regression to predict bioactivity, using a network with two hidden layers and a single linear output unit. It runs very slowly on my CPU and produces a very bad r-square (negative). I then ran the same network architecture with skflow. It was much faster (more than 100x), and the r2 was much better than my previous run (r2 = 0.3), although still not a good result. Does anyone know why? Is there anything wrong with my code? What is the difference between my code and the underlying skflow code? Is my loss function defined correctly? Any help is much appreciated. Here is the code:

# with scikit flow

import skflow
from sklearn.metrics import r2_score

dnn_reg = skflow.TensorFlowDNNRegressor(hidden_units=[200, 500], steps=3000, learning_rate=0.5)
dnn_reg.fit(x_train, y_train)
pred_train = dnn_reg.predict(x_train)
pred_valid = dnn_reg.predict(x_valid)
print('r-square for training set', r2_score(y_train, pred_train))
print('r-square for validation set', r2_score(y_valid, pred_valid))

# tensorflow code

import numpy as np
import tensorflow as tf
from sklearn.metrics import r2_score

n_samples = 15000
n_features = 171
batch_size = 1000
num_batch = n_samples // batch_size
hidden1 = 200
hidden2 = 100
learning_rate = 0.01
n_epoch = 3000

graph = tf.Graph()
with graph.as_default():
    # constants and placeholders
    tf_train_data = tf.placeholder(tf.float32, shape=(batch_size, n_features))
    tf_train_act = tf.placeholder(tf.float32, shape=(batch_size,))
    tf_valid_data = tf.constant(x_valid.astype(np.float32))

    # variables
    w1 = tf.Variable(tf.truncated_normal([n_features, hidden1]), name='weight1')
    b1 = tf.Variable(tf.zeros([hidden1]), name='bias1')
    w2 = tf.Variable(tf.truncated_normal([hidden1, hidden2]), name='weight2')
    b2 = tf.Variable(tf.zeros([hidden2]), name='bias2')
    w3 = tf.Variable(tf.truncated_normal([hidden2, 1]), name='weight3')
    b3 = tf.Variable(tf.zeros([1]), name='bias3')

    # parameter histograms
    w1_hist = tf.histogram_summary('weight_input', w1)
    w2_hist = tf.histogram_summary('weight2', w2)
    w3_hist = tf.histogram_summary('weight3', w3)
    b1_hist = tf.histogram_summary('bias1', b1)
    b2_hist = tf.histogram_summary('bias2', b2)
    b3_hist = tf.histogram_summary('bias3', b3)
    #y_hist = tf.histogram_summary('y', y_train)

    # training computation
    def forward_prop(inputs):
        with tf.name_scope('hidden_1') as scope:
            h1 = tf.nn.relu(tf.matmul(inputs, w1) + b1)
        with tf.name_scope('hidden_2') as scope:
            h2 = tf.nn.relu(tf.matmul(h1, w2) + b2)
        with tf.name_scope('output') as scope:
            output = tf.matmul(h2, w3) + b3
        return output

    # forward propagation
    output = forward_prop(tf_train_data)
    with tf.name_scope('cost') as scope:
        loss = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(tf_train_act, output))))
        cost_summary = tf.scalar_summary('cost', loss)

    # optimizer
    with tf.name_scope('train') as scope:
        optimizer = tf.train.AdagradOptimizer(learning_rate).minimize(loss)

    # predictions
    train_prediction = output
    valid_prediction = forward_prop(tf_valid_data)


with tf.Session(graph=graph) as session:

    session.run(tf.initialize_all_variables())
    print('initialized')

    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter('./logs/log1', session.graph)

    for epoch in range(n_epoch):
        mini = np.array_split(range(y_train.shape[0]), num_batch)
        for idx in mini[:-1]:
            batch_x = x_train[idx]
            batch_y = y_train[idx]
            feed_dict = {tf_train_data: batch_x, tf_train_act: batch_y}
            _, l, pred_train = session.run([optimizer, loss, output], feed_dict=feed_dict)

        if epoch % 100 == 0:
            print('minibatch loss at step %d: %f' % (epoch, l))
            print('minibatch r2: %0.1f' % r2_score(batch_y, pred_train))
            print('validation r2: %0.1f' % r2_score(y_valid, valid_prediction.eval()))

Answer

There are quite a few differences between your TensorFlowDNNRegressor and the vanilla TensorFlow model, including these parameters:

hidden2 = 100

learning_rate = 0.01

batch_size = 1000, whereas the default batch_size of TensorFlowDNNRegressor is 32. I think that is the main reason TensorFlowDNNRegressor runs so much faster.

In addition, TensorFlowDNNRegressor uses SGD as its default optimizer.
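
For an apples-to-apples speed comparison, the skflow estimator can be pinned to the same hyperparameters as the hand-written graph. Below is a minimal sketch, assuming a 2016-era skflow build whose TensorFlowDNNRegressor constructor accepts batch_size and optimizer keyword arguments; check your version's signature before relying on it:

import skflow

# Hypothetical configuration mirroring the vanilla graph: same layer
# sizes, learning rate, batch size, and optimizer, so that only the
# implementation differs, not the hyperparameters.
dnn_reg_matched = skflow.TensorFlowDNNRegressor(
    hidden_units=[200, 100],   # hidden1, hidden2 from the vanilla model
    steps=3000,
    learning_rate=0.01,        # same as the hand-written model
    batch_size=1000,           # override the default of 32
    optimizer='Adagrad')       # match tf.train.AdagradOptimizer
dnn_reg_matched.fit(x_train, y_train)

With identical settings, any remaining speed gap points at the graph or training loop itself rather than the hyperparameters.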

Thank you very much for your answer. I did change the parameters and got almost the same speed difference: it took less than a minute, while my program takes more than 15 minutes. There must be something else going on. Is my loss function correct? Do you see anything wrong with my code? I just checked; the default optimizer of TensorFlowDNNRegressor appears to be 'Adagrad'. – zesla

@j.zheng, could you try changing the loss to 'loss = tf.reduce_mean(tf.square(output - tf.expand_dims(tf_train_act, dim=[1])))'? This is the same loss function used in the [skflow source code](https://github.com/tensorflow/tensorflow/blob/c43a32d5d0929170a057862e2cd0b59308421444/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py#L773). –

@j.zheng, alternatively, you could set 'tf_train_act = tf.placeholder(tf.float32, shape=(batch_size, 1))' –
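
Both suggestions above fix the same underlying problem: tf_train_act has shape (batch_size,) while output has shape (batch_size, 1), so the subtraction inside the loss broadcasts to a (batch_size, batch_size) matrix and the model is trained on a meaningless quantity. A minimal NumPy sketch of the same broadcasting rule (TensorFlow broadcasts identically here):

import numpy as np

batch_size = 4
targets = np.ones(batch_size, dtype=np.float32)     # shape (4,), like tf_train_act
preds = np.ones((batch_size, 1), dtype=np.float32)  # shape (4, 1), like output

# Broadcasting pairs every target with every prediction:
diff = targets - preds
print(diff.shape)   # (4, 4) -- the loss averages 16 numbers instead of 4

# Either proposed fix restores an elementwise difference:
print((targets[:, None] - preds).shape)  # (4, 1), mirrors tf.expand_dims(tf_train_act, 1)
print((targets - preds[:, 0]).shape)     # (4,), equivalent to a (batch_size, 1) placeholder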