Serialization, classification in pyBrain, machine learning, prediction

I have training data like the example below (1,000 films for training), and I need to predict the 'budget' of each film:

film_1 = { 
    'title': 'The Hobbit: An Unexpected Journey', 
    'article_size': 25000, 
    'producer': ['Peter Jackson', 'Fran Walsh', 'Zane Weiner'], 
    'release_date': some_date(2013, 11, 28), 
    'running_time': 169, 
    'country': ['New Zealand', 'UK', 'USA'], 
    'budget': dec('200000000') 
}

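Keys such as 'title', 'producer', and 'country' can be treated as features in machine learning, and values such as 'The Hobbit: An Unexpected Journey' and 25000 can be treated as the values used in the learning process. However, training inputs are mostly expected to be real numbers rather than strings. Do I need to convert string fields such as 'title', 'producer', and 'country' to int (through something like classification or serialization), or apply some other operation, so that I can use this data as the training set for my network?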


2013-12-08 smith

I wonder if this is what you need:

film_list = ['title', 'article_size', 'producer', 'release_date', 'running_time', 'country', 'budget']
# Pair each field name with its position in the list
flist = [(i, j) for i, j in enumerate(film_list)]
label = [seq[0] for seq in flist]  # integer labels
name = [seq[1] for seq in flist]   # field names
print(label)
print(name)

>>[0, 1, 2, 3, 4, 5, 6] 
['title', 'article_size', 'producer', 'release_date', 'running_time', 'country', 'budget']

Or you can use your dictionary directly:

labels = list(film_1.keys())
print(labels)

# Before Python 3.7, dict key order is not guaranteed to match insertion order,
# so labels[0] may give you 'producer' instead of 'title':
>>['producer', 'title', 'country', 'release_date', 'budget', 'article_size', 'running_time']
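If the goal is to turn the string values themselves into numbers, rather than only numbering the field names, one common option is multi-hot encoding: each distinct string gets its own 0/1 column. The following is only a minimal sketch under stated assumptions. The two sample records and the helper names build_vocabulary, multi_hot, and encode are hypothetical and are not part of pyBrain or of the question's data.

```python
# Hypothetical sample records; only a few fields are shown for brevity.
films = [
    {'producer': ['Peter Jackson', 'Fran Walsh'],
     'country': ['New Zealand', 'USA'],
     'running_time': 169},
    {'producer': ['Peter Jackson'],
     'country': ['USA'],
     'running_time': 120},
]


def build_vocabulary(records, field):
    """Map every distinct string seen in `field` to a column index."""
    values = sorted({v for r in records for v in r[field]})
    return {v: i for i, v in enumerate(values)}


def multi_hot(values, vocabulary):
    """Return a 0/1 vector with a 1.0 for each value that is present."""
    vec = [0.0] * len(vocabulary)
    for v in values:
        if v in vocabulary:  # strings not in the vocabulary are ignored
            vec[vocabulary[v]] = 1.0
    return vec


def encode(record, vocabularies):
    """Concatenate the numeric field with the encoded string fields."""
    row = [float(record['running_time'])]
    for field in sorted(vocabularies):  # fixed column order: country, producer
        row.extend(multi_hot(record[field], vocabularies[field]))
    return row


vocabularies = {f: build_vocabulary(films, f) for f in ('producer', 'country')}
rows = [encode(f, vocabularies) for f in films]
print(rows)
```

Each row is now a flat list of floats that could be passed as the input part of a training sample. A field with many distinct values, such as 'title', would produce a very wide vector under this scheme, so it may be more practical to drop it or to derive a simpler numeric feature from it.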


2013-12-09 01:20:51 lennon310

Thanks! But I need this: [1,2,3,4,5,6,7], perhaps – smith

Define a new label1 = label + 1 (that is, add 1 to each element). Each number will then map to an entry in film_list – lennon310
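A plain Python list does not support `label + 1` directly (it raises a TypeError), so the shift has to be applied element by element. A minimal sketch, assuming the same film_list as in the answer; the name label_to_name is introduced here only for illustration:

```python
film_list = ['title', 'article_size', 'producer', 'release_date',
             'running_time', 'country', 'budget']

# enumerate(..., 1) starts counting at 1 instead of 0,
# giving a mapping from 1-based label to field name.
label_to_name = dict(enumerate(film_list, 1))
label1 = sorted(label_to_name)

print(label1)
print(label_to_name[1])
```

Here label1 holds the labels 1 through 7, and label_to_name maps each label back to its field name.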
