2016-03-26 97 views
2

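Exporting multiple rows of data to CSV using pandas

I have a matching algorithm that links students to projects. It works, but I am having trouble exporting the data to a CSV file. There are 200 values to output, yet only the last one ends up in the file.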

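The exported data also treats every character as a separate value: I want the whole of 's' in a single column, but the three digits that make up 's' are split across three columns. I have attached pictures below. Any help would be appreciated.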

What it looks like

What it should look like

#Imports for Pandas 

import pandas as pd 
from pandas import DataFrame 

SPA() 
for m in M: 
    s = m['student'] 
    l = m['lecturer'] 
    Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1 
    id = m['projectid'] 
    p = Project[id]['title'] 
    c = Project[id]['sourceid'] 
    r = str(getRank("Single_Projects1copy.csv",s,c)) 


    print(s+","+l+","+p+","+c+","+r) 

    dataPack = (s+","+l+","+p+","+c+","+r) 

    df = pd.DataFrame.from_records([dataPack]) 
    df.to_csv('try.csv') 

Answers

1

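You keep overwriting the file inside the loop, so you end up with only the last row of data. Either append to the CSV with df.to_csv('try.csv', mode="a", header=False), or collect the data in one DataFrame and write it once, outside the loop, like so: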

rows = []  # one tuple per match; DataFrame.append returned a copy and was removed in pandas 2.0
for m in M:
    s = m['student']
    l = m['lecturer']
    Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
    id = m['projectid']
    p = Project[id]['title']
    c = Project[id]['sourceid']
    r = str(getRank("Single_Projects1copy.csv", s, c))

    print(s + "," + l + "," + p + "," + c + "," + r)

    rows.append((s, l, p, c, r))  # keep the fields separate so each gets its own column

df = pd.DataFrame(rows)
df.to_csv('try.csv')  # write all data once outside the loop
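The append-mode alternative, df.to_csv with mode="a" and header=False, could look roughly like the sketch below. The rows in matches and the temporary file path are made-up stand-ins for the values computed in the loop, not data from the question, and index=False is an extra choice made here to suppress the index column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for the (s, l, p, c, r) values computed in the loop.
matches = [
    ("101", "Dr A", "Robotics", "P1", "1"),
    ("102", "Dr B", "Databases", "P2", "3"),
]

# A fresh temporary directory, so the file starts out empty.
path = os.path.join(tempfile.mkdtemp(), "try.csv")

for row in matches:
    # A list holding one tuple gives a one-row frame with five columns.
    # mode="a" appends instead of overwriting; header=False and index=False
    # stop pandas from writing column labels or an index value on each row.
    pd.DataFrame([row]).to_csv(path, mode="a", header=False, index=False)

with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```

Because the file is opened in append mode, running this against an existing try.csv would add to whatever is already there, so the old file would need to be removed first.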

A better option is to open the file once and pass the file object to to_csv:

with open('try.csv', 'w', newline='') as f:  # newline='' avoids blank lines on Windows
    for m in M:
        s = m['student']
        l = m['lecturer']
        Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
        id = m['projectid']
        p = Project[id]['title']
        c = Project[id]['sourceid']
        r = str(getRank("Single_Projects1copy.csv", s, c))
        print(s + "," + l + "," + p + "," + c + "," + r)

        dataPack = (s, l, p, c, r)  # a tuple, so each field gets its own column
        pd.DataFrame([dataPack]).to_csv(f, header=False)

You are getting individual characters because you pass a string, the value of dataPack, to from_records, which then iterates over its characters:

In [18]: df = pd.DataFrame.from_records(["foobar,"+"bar"]) 

In [19]: df 
Out[19]: 
   0  1  2  3  4  5  6  7  8  9
0  f  o  o  b  a  r  ,  b  a  r

In [20]: df = pd.DataFrame(["foobar,"+"bar"]) 

In [21]: df 
Out[21]: 
            0
0  foobar,bar
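To see the same contrast with the five fields from the question, here is a small sketch. The field values and the column names are invented for illustration; only the shape of the resulting frames matters.

```python
import io

import pandas as pd

# Made-up values standing in for s, l, p, c, r from the loop.
s, l, p, c, r = "101", "Dr A", "Robotics", "P1", "1"

# One pre-joined string: pandas sees a single value, so one column.
joined = pd.DataFrame([s + "," + l + "," + p + "," + c + "," + r])
print(joined.shape)

# A list holding one tuple: one row, one column per field.
columns = ["student", "lecturer", "project", "sourceid", "rank"]  # hypothetical names
separate = pd.DataFrame([(s, l, p, c, r)], columns=columns)
print(separate.shape)

# index=False leaves out the leading index column when writing.
buf = io.StringIO()
separate.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

Naming the columns up front also gives the CSV a meaningful header row instead of 0 to 4.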

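I think what you basically want is to keep the fields as a tuple, dataPack = (s, l, p, c, r), and use pd.DataFrame([dataPack]); wrapping the tuple in a list makes it one row with five columns rather than five rows. You do not really need pandas at all here: the csv lib will do all of this for you without creating a DataFrame.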
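For reference, a minimal sketch of the csv-module route, with no DataFrame involved. The rows, the header names, and the temporary path are assumptions made for the example.

```python
import csv
import os
import tempfile

# Hypothetical stand-ins for the (s, l, p, c, r) values computed in the loop.
matches = [
    ("101", "Dr A", "Robotics", "P1", "1"),
    ("102", "Dr B", "Databases", "P2", "3"),
]

path = os.path.join(tempfile.mkdtemp(), "try.csv")

# The csv docs recommend newline="" for files handed to csv.writer.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    # Header names are invented for this sketch.
    writer.writerow(["student", "lecturer", "project", "sourceid", "rank"])
    for row in matches:
        writer.writerow(row)  # each tuple becomes one CSV line

# Read the file back to check what was written.
with open(path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

The writer also quotes fields that themselves contain commas, which joining the fields by hand with "," would not do.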

+0

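Opening the file worked; the CSV now shows the data for all the students. Thanks for your input, much appreciated. The CSV has no header row, but its first column consists of 0s. I will have to make changes to get the column structure right. – MrPool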

+0

I was instructed to use pandas so that it will be easier if the data needs to be exported to MySQL in the future. – MrPool

+0

Do you want to use the CSV header from the file, or create your own? –