2017-05-11 74 views

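Using multiple returned datasets from different functions (Python, pandas)

I am working with 3 datasets and have written 3 separate functions, one per dataset, each doing some data cleaning and manipulation. In the end, I want to combine all 3 cleaned datasets in another function.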

My logic:

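import pandas as pd

def function1():
    df = pd.read_csv('input1.csv')  # placeholder file name
    data1 = df[(df.column1 != "") & (df.column2 != 'MRN') & (df.column3 != "C")]
    return data1.to_csv('data1.csv')  # writes an intermediate file

def function2():
    df = pd.read_csv('input2.csv')  # placeholder file name
    data2 = df[(df.column1 != "A") & (df.column2 != 'M') & (df.column3 != " ")]
    return data2.to_csv('data2.csv')  # writes an intermediate file

def function3():
    df = pd.read_csv('input3.csv')  # placeholder file name
    data3 = df[(df.column1 != "B") & (df.column2 != 'N') & (df.column3 != " ")]
    return data3.to_csv('data3.csv')  # writes an intermediate file

def combinedatasets():
    # pseudocode: combineddata = merge(data1, data2, data3)
    return combineddata.to_csv('combineddata.csv')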

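Right now I write data1, data2 and data3 out as new files in the directory. Is there a way to hold them temporarily inside the script, so that these 3 files are not written and only combineddata.csv is? How do I get the temporary datasets data1, data2 and data3 from the first 3 functions into my combinedatasets function in order to merge them?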

So something like:

import pandas as pd

def function1():
    df = pd.read_csv('input1.csv')  # placeholder file name
    data1 = df[(df.column1 != "") & (df.column2 != 'MRN') & (df.column3 != "C")]
    return data1  # temporary data1, not written to disk

def function2():
    df = pd.read_csv('input2.csv')  # placeholder file name
    data2 = df[(df.column1 != "A") & (df.column2 != 'M') & (df.column3 != " ")]
    return data2  # temporary data2, not written to disk

def function3():
    df = pd.read_csv('input3.csv')  # placeholder file name
    data3 = df[(df.column1 != "B") & (df.column2 != 'N') & (df.column3 != " ")]
    return data3  # temporary data3, not written to disk

def combinedatasets():
    # pseudocode: get the temporary data1, data2, data3 and
    # combineddata = merge(data1, data2, data3)
    return combineddata.to_csv('combineddata.csv')  # output as a csv file

So that only 'combineddata.csv' is written to the folder.

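What do you mean by *combine*? Append them, merge them, or something else? Also, could you show real code rather than pseudocode? It is hard to see your problem, because 'read_csv' is not a method of any object other than the top-level 'pandas' module. Did you mean 'to_csv'? – Parfait

Yes, I am using pandas, let me edit. – Jessica

Answers

Since each function returns a DataFrame, simply assign the result of the function call to an object:

def myfunction():
    data = pd.read_csv('Input.csv')
    # process dataframe...
    return data

def combinedatasets():
    df = myfunction()

Or assign all of them at once:

def combinedatasets():
    data1, data2, data3 = function1(), function2(), function3()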
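
A minimal, self-contained sketch of this pattern follows. It assumes two small in-memory CSV strings (CSV1 and CSV2, made up for illustration) in place of the real input files, and it assumes row-wise concatenation as the way of combining. Only the combined file is written:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real input files.
CSV1 = "column1,column2,column3\nA,M,x\nB,K,y\nC,K,z\n"
CSV2 = "column1,column2,column3\nB,N,x\nD,K,y\n"


def function1():
    # In a real script this would be pd.read_csv('<your input file>').
    df = pd.read_csv(io.StringIO(CSV1))
    # Return the filtered frame in memory; nothing is written to disk.
    return df[(df.column1 != "A") & (df.column2 != "M")]


def function2():
    df = pd.read_csv(io.StringIO(CSV2))
    return df[(df.column1 != "B") & (df.column2 != "N")]


def combinedatasets(path):
    # Each call hands back a DataFrame that lives only in this function.
    data1, data2 = function1(), function2()
    combined = pd.concat([data1, data2], ignore_index=True)
    combined.to_csv(path, index=False)  # the only file that gets written
    return combined
```

Calling combinedatasets('combineddata.csv') then leaves a single CSV in the folder, while data1 and data2 are discarded when the function returns.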

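However, avoid keeping multiple similarly structured DataFrames in your environment. Instead, save them to a list, which you can then merge or append together: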

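def combinedatasets():
    dfList = [function1(), function2(), function3()]

    # MERGE/COLUMN BIND
    # place the frames side by side, aligned on the first frame's index
    # (the join_axes argument is not available in pandas 1.0 and later)
    combinedf = pd.concat(dfList, axis=1).reindex(dfList[0].index)
    combinedf.to_csv('CombinedWideData.csv')

    # APPEND/ROW BIND
    # stack the frames on top of each other
    combinedf = pd.concat(dfList)
    combinedf.to_csv('CombinedLongData.csv')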
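
To make the difference concrete, here is a small sketch with two made-up frames (df_a, df_b and the key column id are assumptions, not taken from the question). It contrasts a column bind, a row bind, and a key-based merge, which may be closer to what is wanted if the datasets share an identifier:

```python
from functools import reduce

import pandas as pd

# Hypothetical frames; "id" is an assumed shared key column.
df_a = pd.DataFrame({"id": [1, 2], "x": [10, 20]})
df_b = pd.DataFrame({"id": [1, 2], "y": [30, 40]})
df_list = [df_a, df_b]

# Column bind: frames side by side, aligned on the row index.
# The "id" column appears twice because nothing is matched on it.
wide = pd.concat(df_list, axis=1)

# Row bind: frames stacked; columns missing from a frame become NaN.
long = pd.concat(df_list, ignore_index=True)

# Key-based merge: rows are matched on "id", which is kept only once.
merged = reduce(lambda left, right: pd.merge(left, right, on="id"), df_list)

print(wide.shape, long.shape, merged.shape)
```

The column bind relies on the row index lining up, whereas the merge relies on the values in the key column, so the right choice depends on how the three cleaned datasets relate to each other.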