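How to define data to fit a classifier

I am new to TensorFlow. I created a 204x4 matrix in which the first three columns are features and the last column is the target. How do I need to transform the array so that TensorFlow can train on the data?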

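import numpy as np
import tensorflow as tf  # TensorFlow 1.x (tf.contrib and queue-based input)

# seq, llength and tlength are defined elsewhere (not shown in the question)
TRAINING_SET = np.asarray(seq[:llength])
VALIDATION_SET = np.asarray(seq[llength:llength+tlength])
TEST_SET = np.asarray(seq[llength+tlength:])
num_epochs = 100
batch_size = 32
featureColumns = np.shape(TRAINING_SET)[1]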

# Return batched (features, labels) tensors; pass toShuffle=False for validation and test
def data_input_fn(trainset, batch_size, num_epochs, toShuffle):
    data_f = trainset[:, :(featureColumns-1)]  # all columns except the last are features
    data_l = trainset[:, (featureColumns-1)]   # the last column is the target
    data_f_single, data_l_single = tf.train.slice_input_producer([data_f, data_l], num_epochs=num_epochs, shuffle=toShuffle)

    if toShuffle:
        data_f_batch, data_l_batch = tf.train.shuffle_batch([data_f_single, data_l_single], batch_size=batch_size, capacity=400, min_after_dequeue=2*batch_size)
    else:
        # tf.train.batch has no min_after_dequeue argument
        data_f_batch, data_l_batch = tf.train.batch([data_f_single, data_l_single], batch_size=batch_size, capacity=400)

    return data_f_batch, data_l_batch

def main(): 

    # Specify that all features have real-value data 
    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=3)] 

    # Build 3 layer DNN with 10, 20, 10 units respectively. 
    classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns, 
               hidden_units=[10, 20, 10], 
               n_classes=10, 
               model_dir="/tmp/iris_model") 

    # Fit model. 
    classifier.fit(input_fn=lambda: data_input_fn(TRAINING_SET, batch_size, num_epochs, True), steps=4000) 

    # Evaluate accuracy on the validation and test sets.
    accuracy_validation_score = classifier.evaluate(input_fn=lambda: data_input_fn(VALIDATION_SET, batch_size, num_epochs, False),
                                                    steps=1)["accuracy"]

    accuracy_test_score = classifier.evaluate(input_fn=lambda: data_input_fn(TEST_SET, batch_size, num_epochs, False),
                                              steps=1)["accuracy"]

    print("\nValidation Accuracy: {0:0.2f}\nTest Accuracy: {1:0.2f}\n".format(accuracy_validation_score, accuracy_test_score))

    # Classify two new samples.
    def new_samples():
        return np.array(
            [[327, 8, 3],
             [47, 8, 0]], dtype=np.float32)

    predictions = list(classifier.predict_classes(input_fn=new_samples))

This gives

TypeError: 'Tensor' object is not callable

2017-08-05 Chris

You need to pass a function as input_fn, not just a tensor.

TRAINING_SET = np.asarray(seq[:llength]) 
VALIDATION_SET= np.asarray(seq[llength:llength+tlength]) 
TEST_SET = np.asarray(seq[llength+tlength:]) 
num_epochs=100 
batch_size = 32 
# Define a function that returns the data in batches. It also works for test and validation: set shuffle=False and replace tf.train.shuffle_batch with tf.train.batch (which takes no min_after_dequeue argument).
def data_input_fn(trainset, batch_size, num_epochs): 
    data_f = trainset[:, :3] 
    data_l = trainset[:, 3] 
    data_f_single, data_l_single = tf.train.slice_input_producer([data_f, data_l], num_epochs=num_epochs, shuffle=True) 
    data_f_batch, data_l_batch = tf.train.shuffle_batch([data_f_single, data_l_single], batch_size=batch_size, capacity=400, min_after_dequeue=2*batch_size) 
    return data_f_batch, data_l_batch 

# use this function as input_fn to fit 
classifier.fit(input_fn=lambda: data_input_fn(TRAINING_SET, batch_size, num_epochs), steps=4000)
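
To illustrate why a function is needed, here is a minimal sketch that does not use TensorFlow at all. `make_batch` and `toy_fit` are hypothetical stand-ins, not library functions; the only assumption is that the estimator calls `input_fn()` itself, which is why handing it an already-built value produces a "not callable" error.

```python
def make_batch():
    """Hypothetical stand-in for data_input_fn: returns (features, labels)."""
    features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    labels = [0, 1]
    return features, labels


def toy_fit(input_fn):
    """Hypothetical stand-in for an estimator's fit(): it calls input_fn itself."""
    features, labels = input_fn()
    return len(features)


# Correct: pass the function (or a lambda wrapping it), not its result.
n_ok = toy_fit(input_fn=make_batch)
n_lambda = toy_fit(input_fn=lambda: make_batch())

# Incorrect: make_batch() has already run, so toy_fit receives a tuple
# and fails when it tries to call it.
try:
    toy_fit(input_fn=make_batch())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The `lambda:` wrapper in the `classifier.fit` call above plays the same role: it defers the call to `data_input_fn` until the estimator is ready to build its graph.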

2017-08-05 21:03:04

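Awesome. Could you elaborate on what the num_epochs and batch_size parameters are for? I see there is a shuffle=True parameter, so I suppose I don't have to shuffle the matrix randomly myself. – Chris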

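`batch_size` is the number of samples in each mini-batch that the optimizer uses to update the parameters. `num_epochs` is the number of times the same dataset is fed through training. One epoch corresponds to `num_training_samples/batch_size` iterations; basically one forward and backward pass over all training samples.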
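
As a rough worked example of that arithmetic, using the values from the question: the sketch below assumes all 204 rows are used for training, which is an illustration only, since the real training split (`llength`) is not given.

```python
import math

num_training_samples = 204  # assumption: all 204 rows are used for training
batch_size = 32
num_epochs = 100

# Iterations needed to see every sample once (the last batch may be smaller).
steps_per_epoch = math.ceil(num_training_samples / batch_size)

# Total samples the input pipeline yields before it is exhausted.
total_samples = num_training_samples * num_epochs

# Full batches available across all epochs when batches are drawn continuously.
total_full_batches = total_samples // batch_size

print(steps_per_epoch)     # 7
print(total_full_batches)  # 637
```

Under these assumptions, `steps=4000` asks for more batches than 100 epochs can supply, so the epoch limit rather than `steps` would be expected to end training.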

I modified the code in my question. I get the following error: `InvalidArgumentError (see above for traceback): tensor_name = dnn/logits/biasses/dnn/logits/biases/part_0/Adagrad; shape_and_slice spec [10] does not match the shape stored in checkpoint: [9] [[Node: save/RestoreV2_13 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_13/tensor_names, save/RestoreV2_13/shape_and_slices)]]` – Chris
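
An error of this shape typically means the checkpoint already stored in `model_dir` was written by a model with a different output size (here 9 versus 10 logits), for example by an earlier run with a different `n_classes`. A minimal sketch of one way around it, assuming that is the cause: give each run a fresh, empty checkpoint directory. The helper name `fresh_model_dir` and its prefix are hypothetical.

```python
import os
import tempfile


def fresh_model_dir(prefix="dnn_model_"):
    """Create and return a new, empty checkpoint directory (hypothetical helper)."""
    return tempfile.mkdtemp(prefix=prefix)


model_dir = fresh_model_dir()

# Pass this path as model_dir=... when constructing the classifier, so that
# no stale checkpoint with mismatched shapes is restored.
print(model_dir)
```

Alternatively, deleting the old directory (`/tmp/iris_model` in the question) before retraining should have the same effect.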
