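Pandas DataFrame dropped column reappears
Sorry if I am doing something silly, but I am very puzzled by this issue: I pass a DataFrame to a function, and inside that function I add a column and then drop it. Nothing strange so far, but after the function finishes, the DataFrame in the global namespace shows the column that was added and dropped. If I declare the DataFrame as global, this does not happen...
This test code shows the issue in all four cases resulting from combining Python 3.3.3/2.7.6 with pandas 0.13.0/0.12.0: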
#!/usr/bin/python
import pandas as pd
# FUNCTION DFcorr
def DFcorr(df):
    # Calculate column of accumulated elements
    df['SUM']=df.sum(axis=1)
    print('DFcorr: DataFrame after add column:')
    print(df)
    # Drop column of accumulated elements
    df=df.drop('SUM',axis=1)
    print('DFcorr: DataFrame after drop column:')
    print(df)
# FUNCTION globalDFcorr
def globalDFcorr():
    global C
    # Calculate column of accumulated elements
    C['SUM']=C.sum(axis=1)
    print('globalDFcorr: DataFrame after add column:')
    print(C)
    # Drop column of accumulated elements
    print('globalDFcorr: DataFrame after drop column:')
    C=C.drop('SUM',axis=1)
    print(C)
######################### MAIN #############################
C = pd.DataFrame.from_items([('A', [1, 2]), ('B', [3 ,4])], orient='index', columns=['one', 'two'])
print('\nMAIN: Initial DataFrame:')
print(C)
DFcorr(C)
print('MAIN: DataFrame after call to DFcorr')
print(C)
C = pd.DataFrame.from_items([('A', [1, 2]), ('B', [3 ,4])], orient='index', columns=['one', 'two'])
print('\nMAIN: Initial DataFrame:')
print(C)
globalDFcorr()
print('MAIN: DataFrame after call to globalDFcorr')
print(C)
And here is the output:
MAIN: Initial DataFrame:
   one  two
A    1    2
B    3    4
[2 rows x 2 columns]
DFcorr: DataFrame after add column:
   one  two  SUM
A    1    2    3
B    3    4    7
[2 rows x 3 columns]
DFcorr: DataFrame after drop column:
   one  two
A    1    2
B    3    4
[2 rows x 2 columns]
MAIN: DataFrame after call to DFcorr
   one  two  SUM
A    1    2    3
B    3    4    7
[2 rows x 3 columns]
MAIN: Initial DataFrame:
   one  two
A    1    2
B    3    4
[2 rows x 2 columns]
globalDFcorr: DataFrame after add column:
   one  two  SUM
A    1    2    3
B    3    4    7
[2 rows x 3 columns]
globalDFcorr: DataFrame after drop column:
   one  two
A    1    2
B    3    4
[2 rows x 2 columns]
MAIN: DataFrame after call to globalDFcorr
   one  two
A    1    2
B    3    4
[2 rows x 2 columns]
What am I missing? Many thanks!
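Thanks for your answer. So should I understand that when a DataFrame identifier (here `df`) is used in function scope, it can refer to either the global or the local variable? I mean, from the answer, should I understand that in `df['SUM'] = df.sum(axis=1)` the name `df` affects the global variable, whereas in `df = df.drop('SUM', axis=1)` and `print(df)` the name `df` refers to a local variable? – khyox
Here is my reasoning, so you can check whether I understood your answer correctly: if I think of `df` as a C/C++ pointer, at the start of the called function it points to the DataFrame in the global scope (here `C`). When `df['SUM'] = df.sum(axis=1)` executes, it still refers to `C`, but once `df = df.drop('SUM', axis=1)` executes, `df` points to a new DataFrame that is local to the function. Is this reasoning correct? Many thanks. – khyox
@khyox: Yes, I think you've got it! Here is an explanation of Python's [pass-by-assignment](http://stackoverflow.com/a/8140747/190597) function-call paradigm. – unutbu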
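For anyone who wants to see the reasoning from the comments in isolation, here is a minimal sketch that needs no pandas at all. It assumes a plain dict of columns as a stand-in for the DataFrame; the function name `add_then_drop` and the variable names are made up for illustration and do not come from the question. Item assignment mutates the object the caller passed in, while `name = ...` only rebinds the local name to a new object.

```python
def add_then_drop(table):
    # Remember which object the caller handed us.
    caller_object = table

    # Mutation: item assignment changes the shared object in place,
    # so the caller sees the new 'SUM' column as well.
    table['SUM'] = [x + y for x, y in zip(table['one'], table['two'])]

    # Rebinding: this builds a NEW dict and points the local name at it,
    # much like df = df.drop('SUM', axis=1) returns a new DataFrame.
    table = {key: value for key, value in table.items() if key != 'SUM'}

    # The local name no longer refers to the caller's object.
    return table is caller_object, table


# Hypothetical stand-in for the DataFrame C from the question (columns only).
C = {'one': [1, 3], 'two': [2, 4]}

still_same_object, local_result = add_then_drop(C)

print(still_same_object)      # False
print(sorted(local_result))   # ['one', 'two']
print(sorted(C))              # ['SUM', 'one', 'two']
```

If the goal is for the caller to end up without the extra column, one option is to return the new object from the function and reassign it at the call site (`C = DFcorr(C)`), and another is to work on `df.copy()` inside the function so the caller's DataFrame is never touched.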