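Reading specific tuples from a file in Python
Using seek and tell does not work here, because tell returns the current position in bytes; I need the line number, not the position of the file pointer.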
I have a file, glass.csv, and I need to cluster the dataset. Each line in the file ends with a number such as 1, 2, 3, ..., like below:
65,1.52172,13.48,3.74,0.90,72.01,0.18,9.61,0.00,0.07,1
66,1.52099,13.69,3.59,1.12,71.96,0.09,9.40,0.00,0.00,1
67,1.52152,13.05,3.65,0.87,72.22,0.19,9.85,0.00,0.17,1
68,1.52152,13.05,3.65,0.87,72.32,0.19,9.85,0.00,0.17,1
69,1.52152,13.12,3.58,0.90,72.20,0.23,9.82,0.00,0.16,1
70,1.52300,13.31,3.58,0.82,71.99,0.12,10.17,0.00,0.03,1
71,1.51574,14.86,3.67,1.74,71.87,0.16,7.36,0.00,0.12,2
72,1.51848,13.64,3.87,1.27,71.96,0.54,8.32,0.00,0.32,2
73,1.51593,13.09,3.59,1.52,73.10,0.67,7.83,0.00,0.00,2
74,1.51631,13.34,3.57,1.57,72.87,0.61,7.89,0.00,0.00,2
142,1.51851,13.20,3.63,1.07,72.83,0.57,8.41,0.09,0.17,2
143,1.51662,12.85,3.51,1.44,73.01,0.68,8.23,0.06,0.25,2
144,1.51709,13.00,3.47,1.79,72.72,0.66,8.18,0.00,0.00,2
145,1.51660,12.99,3.18,1.23,72.97,0.58,8.81,0.00,0.24,2
146,1.51839,12.85,3.67,1.24,72.57,0.62,8.68,0.00,0.35,2
147,1.51769,13.65,3.66,1.11,72.77,0.11,8.60,0.00,0.00,3
148,1.51610,13.33,3.53,1.34,72.67,0.56,8.33,0.00,0.00,3
149,1.51670,13.24,3.57,1.38,72.70,0.56,8.44,0.00,0.10,3
150,1.51643,12.16,3.52,1.35,72.89,0.57,8.53,0.00,0.00,3
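I need to take a certain share of the tuples that have 1 as the last number and save them in another file (train.txt), and put the rest in a second file (test.txt). Likewise, I need to take the tuples that have 2 as the last number, append the same share to the first file, train.txt, and append the rest to test.txt.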
I cannot get the second set of tuples; instead, the first result itself gets appended again.
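On the seek/tell point: tell() reports a byte offset, so a line number has to be counted while reading instead. A minimal sketch, assuming the file is read record by record with the csv module; `numbered_rows` is a hypothetical helper name, not something from the question:

```python
import csv
import io

def numbered_rows(lines):
    """Yield (line_number, fields) pairs, counting lines from 1.

    Hypothetical helper: `lines` is any iterable of text lines,
    such as an open file object.
    """
    for line_number, fields in enumerate(csv.reader(lines), start=1):
        if fields:  # blank lines are counted but not yielded
            yield line_number, fields

# Two rows copied from the sample above, standing in for glass.csv.
sample = io.StringIO(
    "65,1.52172,13.48,3.74,0.90,72.01,0.18,9.61,0.00,0.07,1\n"
    "71,1.51574,14.86,3.67,1.74,71.87,0.16,7.36,0.00,0.12,2\n"
)
for line_number, fields in numbered_rows(sample):
    # fields[-1] is the last number on the line
    print(line_number, fields[-1])
```

With a real file, the same loop would be written as `with open("glass.csv", newline="") as f:` followed by `for line_number, fields in numbered_rows(f):`.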
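Nothing above says anything about putting 70% of each group into one file and 30% into the other. Also, does it have to be the first 70%, in which case you need to count the rows first, or is the first 7 out of every 10 close enough? – 2014-11-06 20:19:48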
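Please go through this link -> archive.ics.uci.edu/ml/machine-learning-databases/glass/...... This is my dataset, and it is for this one that I mentioned the 70-30 split. Each tuple ends with 1, 2, etc. I need to store the first 70% in train.txt and the remaining 30% in test.txt. After that, the tuples with 2 and 3 as the last value need to be retrieved and appended to the above files on the same 70-30 basis. Hope this makes my question specific. – Devi 2014-11-07 08:27:11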
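One possible way to do the per-class 70-30 split described in this comment is to group the rows by their last field in a single pass and then cut each group. This is only a sketch under assumptions: the names `split_by_class`, `split_file` and `train_percent` are made up for illustration, "first 70%" is taken to mean the first rows of each class in file order, and the count is rounded down.

```python
import csv
from collections import defaultdict

def split_by_class(rows, train_percent=70):
    """Return (train, test) lists of rows.

    Rows are grouped by their last field (the class label). The first
    `train_percent` percent of each group, rounded down, goes to train
    and the remainder goes to test.
    """
    groups = defaultdict(list)
    for row in rows:
        if row:  # ignore blank lines
            groups[row[-1]].append(row)

    train, test = [], []
    # dicts keep insertion order, so classes appear in file order
    for members in groups.values():
        cut = len(members) * train_percent // 100  # integer arithmetic
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test

def split_file(source="glass.csv", train_path="train.txt",
               test_path="test.txt"):
    """Read `source` once and write both output files once."""
    with open(source, newline="") as f:
        train, test = split_by_class(csv.reader(f))
    with open(train_path, "w", newline="") as f:
        csv.writer(f).writerows(train)
    with open(test_path, "w", newline="") as f:
        csv.writer(f).writerows(test)
```

Calling `split_file()` in the directory that holds glass.csv would write train.txt and test.txt. Because the file is read once and each output is written once, there is no need to reopen or rewind the input for every class, so no seek or tell is involved.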