Python pandas: can't shake the SettingWithCopyWarning, even with clear-conscience use of .loc

I have already looked through the existing questions about this warning, but I still don't fully understand the problem.

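I want to write a module that matches strings by progressively transforming them on both the source and the target side and checking for further matches after each step. To keep track of repeated transform/match attempts, I use DataFrames for the source, the target, and the matches.

So part of the solution is to create source/target subsets containing only the items that have not been matched yet, apply the transformations, and extract any resulting matches. My code looks like this: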

import pandas as pd

def trymatch(transformers):

    global matches, source, target

    # Only work on rows that have not been matched yet
    if matches is not None:
        s_ids = matches['id_s'].values
        s_inmask = ~source['id'].isin(s_ids)
        s = source.loc[s_inmask].copy()
        # Same for the target dataframe
        t_ids = matches['id_t'].values
        t_inmask = ~target['id'].isin(t_ids)
        t = target.loc[t_inmask].copy()
    else:
        s = source
        t = target

    for transformer in transformers:
        # Call the transformations here...
        # (they presumably fill the 'matchval' column used by the merge below)
        pass

    mnew = pd.merge(s, t, on='matchval', suffixes=['_s', '_t'])

    if matches is None:
        matches = mnew
    else:
        # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
        matches = pd.concat([matches, mnew])

# ----------------------------------------------------------------------

source = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})
target = pd.DataFrame({'id': [4, 5, 6], 'value': ['A', 'b', 'd']})

matches = None
trymatch(['t_null'])
trymatch(['t_upper'])

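My trouble is in the trymatch function, at the point where I create the subsets once matches already exist. Even with .loc indexing, pandas throws a SettingWithCopyWarning at me. I can get rid of it with .copy(), as shown here, and I think that is valid because I only need a temporary copy of the subset inside this function.

Does this look valid? Could I suppress the warning with .is_copy = False instead and save the memory?

Is there a more Pythonic way to approach this that sidesteps the problem entirely?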

Source

2017-02-17 richarddb

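What you wrote is valid. pandas raises the SettingWithCopyWarning in this case because it builds on NumPy array semantics, where, for efficiency, indexing can return a view of the data rather than a copy. pandas cannot itself tell whether this will cause a problem, so it (conservatively) raises the warning in the harmless cases as well as the harmful ones.

You can get rid of the warning message with: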
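A minimal sketch of the .copy() pattern from the question, using the same sample data. The name matched_ids and the upper-casing step are assumptions for illustration only; the point is that the subset is an independent frame, so assigning to it leaves source alone and should not trigger the warning.

```python
import warnings
import pandas as pd

source = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})
matched_ids = [2]  # hypothetical: ids that already have a match

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # Explicit copy: the subset owns its data from here on
    subset = source.loc[~source['id'].isin(matched_ids)].copy()
    # Assigning a new column touches only the copy
    subset['matchval'] = subset['value'].str.upper()

# Collect any SettingWithCopy-style warnings that were recorded
copy_warnings = [w for w in caught if 'SettingWithCopy' in w.category.__name__]
```

The cost is one extra copy of the unmatched rows per call, which is usually small next to the merge itself.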

pd.options.mode.chained_assignment = None # default='warn'
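If switching the check off for the whole session feels too broad, the same option can be set temporarily with pd.option_context. This is only a sketch: it assumes a pandas version that still provides the mode.chained_assignment option, and the subset s is a hypothetical stand-in for the unmatched rows.

```python
import pandas as pd

source = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})

# Hypothetical subset of rows that still need matching (no explicit copy)
s = source.loc[source['id'] != 2]

# The chained-assignment check is disabled only inside this block
with pd.option_context('mode.chained_assignment', None):
    s['matchval'] = s['value'].str.upper()
```

Once the with block exits, the option returns to whatever value it had before.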

For more details, see "How to deal with SettingWithCopyWarning in Pandas?"
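As for a more Pythonic structure, one possible approach is to drop the globals and have the function take the frames as arguments and return the updated matches. The sketch below is an assumption about how the workflow could look, not the asker's actual module: try_match, the callable transformers, and the way matchval is filled are all hypothetical.

```python
import pandas as pd

def try_match(source, target, matches, transformer):
    """Match the still-unmatched rows after applying `transformer` to 'value'."""
    if matches is not None:
        # Keep only rows whose ids are not in the matches found so far
        s = source.loc[~source['id'].isin(matches['id_s'])].copy()
        t = target.loc[~target['id'].isin(matches['id_t'])].copy()
    else:
        s, t = source.copy(), target.copy()
    # Work on the copies so the caller's frames are never modified
    s['matchval'] = s['value'].map(transformer)
    t['matchval'] = t['value'].map(transformer)
    mnew = pd.merge(s, t, on='matchval', suffixes=['_s', '_t'])
    if matches is None:
        return mnew
    return pd.concat([matches, mnew], ignore_index=True)

source = pd.DataFrame({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})
target = pd.DataFrame({'id': [4, 5, 6], 'value': ['A', 'b', 'd']})

matches = None
# Hypothetical transformers: identity first, then upper-casing
for transformer in (lambda v: v, str.upper):
    matches = try_match(source, target, matches, transformer)
```

With the sample data, the first pass pairs 'b' with 'b' and the second pairs 'a' with 'A' once both are upper-cased, while source and target keep their original columns.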

Source

2017-02-18 00:03:26
