Pandas DataFrame: how to copy values into other columns depending on a value in the row

Hello, I am working with a DataFrame like the following:

yearStart 2014 2015 2016 2017 2018 2019 
0 2015  0 150 200  0  0  0  
1 2016  0  0 200 140 35 10  
2 2017  0  0  0  20 12 12

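Typically, it is a financial report of all the costs, for contracts beginning in different years. I would like to get something like this: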

yearStart Year+0 Year+1 Year+2 Year+3 Year+4 ... Year+N 
0  2015 150  200  0  0  0  
1  2016 200  140  35  0  0  
2  2017 20  12  12  0  0

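How can I reshape the DataFrame so that the data are stored in a relative-date style, counted from the first year of the contract (column 'yearStart') and running for its duration?

I tried going through each row with iterrows() and copying the relevant columns into another DataFrame, but it takes too much time...

Edit:

Well, I forgot that a year within the contract's term may have a value of 0, and it should not be dropped. The columns to consider are those between yearStart and an end date, given as a parameter. The input looks more like this: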

0 2015  0 150 200  0 13  0
1 2016  0  0 200 140 35  0 10
2 2017  0  0  0  20 12  0 12

Thanks

Source

2017-09-18 phil

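I changed the answer according to your edit. – jezrael

Regarding my updated question, it works perfectly (128 seconds to reshape a 200 000 * 4 DataFrame). I used the Series mask syntax to do it (thanks to Zero and Jarad). A fun fact: when I checked each row with a 'print()' inside the apply() method, I noticed that the first row was displayed twice, even though this had no effect on the final result – phil

If my answer or another one was helpful, don't forget to [accept](http://meta.stackexchange.com/a/5235/295067) it: click the check mark ('✓') next to the answer to toggle it from greyed out to filled in. Thanks. – jezrael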

# Treat zeros as missing values.
df = df.replace({0: np.nan})
# Drop the columns that are NaN in all 3 rows (here: 2014).
df = df.loc[:, df.isnull().sum(0).ne(3)]
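
To make the two preparation lines above easier to follow, here is a minimal, self-contained sketch that builds the sample frame from the question and applies them. The integer column labels and the variable names nan_df and trimmed are assumptions made for illustration, and len(nan_df) is used in place of the hard-coded 3.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (integer year labels are an assumption).
df = pd.DataFrame(
    {
        "yearStart": [2015, 2016, 2017],
        2014: [0, 0, 0],
        2015: [150, 0, 0],
        2016: [200, 200, 0],
        2017: [0, 140, 20],
        2018: [0, 35, 12],
        2019: [0, 10, 12],
    }
)

# Zeros become NaN so they can be treated as "missing".
nan_df = df.replace({0: np.nan})

# Keep only the columns that are not NaN in every row.
trimmed = nan_df.loc[:, nan_df.isnull().sum(0).ne(len(nan_df))]
print(trimmed.columns.tolist())
```

With this sample only the 2014 column is empty in every row, so it is the only one removed.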

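Option 1:

# Non-null values first, NaNs last; pandas >= 0.23 may need result_type='broadcast' to get a DataFrame back
df.apply(lambda x: x[x.notnull()].values.tolist() + x[x.isnull()].values.tolist(), axis=1).fillna(0)

Out[145]: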

yearStart 2015 2016 2017 2018 2019 
0  2015.0 150.0 200.0 0.0 0.0 0.0 
1  2016.0 200.0 140.0 35.0 10.0 0.0 
2  2017.0 20.0 12.0 12.0 0.0 0.0

Option 2:

# sorted() is stable, so non-null values keep their order and NaNs move to the end
df.apply(lambda x: sorted(x, key=pd.isnull), axis=1).fillna(0)


Out[145]: 
    yearStart 2015 2016 2017 2018 2019 
0  2015.0 150.0 200.0 0.0 0.0 0.0 
1  2016.0 200.0 140.0 35.0 10.0 0.0 
2  2017.0 20.0 12.0 12.0 0.0 0.0
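
In recent pandas versions, apply() with a function that returns a plain list tends to give a Series of lists rather than a DataFrame. The following is a minimal sketch of the same idea as Option 2 that sidesteps this by rebuilding the frame from its rows. It is not part of the original answer; the helper name nulls_last and the hard-coded sample frame are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# The frame after zeros were replaced by NaN and the empty 2014 column dropped.
df = pd.DataFrame(
    {
        "yearStart": [2015, 2016, 2017],
        2015: [150, np.nan, np.nan],
        2016: [200, 200, np.nan],
        2017: [np.nan, 140, 20],
        2018: [np.nan, 35, 12],
        2019: [np.nan, 10, 12],
    }
)


def nulls_last(row):
    # sorted() is stable: non-null values keep their order, NaNs go last.
    return sorted(row, key=pd.isnull)


# Rebuild the frame explicitly so the result does not depend on how
# DataFrame.apply treats list return values in a given pandas version.
shifted = pd.DataFrame(
    [nulls_last(row) for row in df.to_numpy()],
    index=df.index,
    columns=df.columns,
).fillna(0)
print(shifted)
```

This still loops over rows in Python, so it is meant as a readable equivalent rather than a faster one.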

Source

2017-09-18 16:07:26 Wen

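Use apply to create new rows from the filtered values, then assign new column names:

# Keep only the non-zero values of each row; shorter rows are padded with 0
df1 = df.apply(lambda x: pd.Series(x[x != 0].values), axis=1).fillna(0).astype(int)
# Reuse the leading original column names, then restore any missing columns as 0
df1.columns = df.columns.tolist()[:len(df1.columns)]
df1 = df1.reindex(columns=df.columns, fill_value=0)
print(df1)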
    yearStart 2014 2015 2016 2017 2018 2019 
0  2015 150 200  0  0  0  0 
1  2016 200 140 35 10  0  0 
2  2017 20 12 12  0  0  0

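For a larger DataFrame, it is possible to use Divakar's function justify_rows:

def justify_rows(a, side='left'):
    # Positions of the values to keep
    mask = a > 0
    # Sorting booleans pushes the True slots to the right of each row
    justified_mask = np.sort(mask, 1)
    if side == 'left':
        # Reverse each row so the True slots sit on the left
        justified_mask = justified_mask[:, ::-1]
    out = np.zeros_like(a)
    out[justified_mask] = a[mask]
    return out

df1 = pd.DataFrame(justify_rows(df.values), columns=df.columns, index=df.index)
print(df1)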
    yearStart 2014 2015 2016 2017 2018 2019 
0  2015 150 200  0  0  0  0 
1  2016 200 140 35 10  0  0 
2  2017 20 12 12  0  0  0
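
The mask trick inside justify_rows can be traced step by step on a tiny array. This is only an illustrative sketch; the array a and the names right and left are made up for the example.

```python
import numpy as np

a = np.array([[0, 150, 200, 0],
              [0, 0, 20, 12]])

mask = a > 0                    # where the values to keep currently sit
right = np.sort(mask, axis=1)   # False sorts before True: slots pushed right
left = right[:, ::-1]           # reverse each row: slots pushed left

out = np.zeros_like(a)
out[left] = a[mask]             # both masks have the same count per row
print(out)
```

Because the mask is a > 0, a zero that lies between two positive values is dropped as well, which matters for the edited question.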

If you want string 'Year+N' column names:

cols = ['yearStart'] + ['Year+{}'.format(x) for x in range(len(df.columns) - 1)]
df1 = pd.DataFrame(justify_rows(df.values), columns=cols, index=df.index)
print(df1)
    yearStart Year+0 Year+1 Year+2 Year+3 Year+4 Year+5 
0  2015  150  200  0  0  0  0 
1  2016  200  140  35  10  0  0 
2  2017  20  12  12  0  0  0

Edit:

The second solution needs this solution for selecting the leading consecutive 0s:

def justify_rows(a, side='left'):
    # True from the first non-zero value onward, so later zeros are kept
    mask = a.cumsum(axis=1) != 0
    print(mask)
    justified_mask = np.sort(mask, 1)
    print(justified_mask)
    if side == 'left':
        justified_mask = justified_mask[:, ::-1]
    out = np.zeros_like(a)
    out[justified_mask] = a[mask]
    print(out)
    return out

# Justify only the year columns, then put yearStart back in front
cols = ['Year+{}'.format(x) for x in range(len(df.columns) - 1)]
df1 = df[['yearStart']].join(pd.DataFrame(justify_rows(df.values[:, 1:]),
                                          columns=cols, index=df.index))
print(df1)
    yearStart Year+0 Year+1 Year+2 Year+3 Year+4 Year+5 
0  2015  150  200  0  13  0  0 
1  2016  200  140  35  0  0  0 
2  2017  20  12  0  0  0  0
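
Since the edited question says the columns to consider start at yearStart, another option is to shift each row by its start year directly instead of detecting leading zeros. The sketch below is not from the answer above. It assumes the year columns are consecutive integers and that every yearStart is one of them, and it uses only the first six values of each input row, as the output above does. The names years, start, src and valid are illustrative.

```python
import numpy as np
import pandas as pd

years = [2014, 2015, 2016, 2017, 2018, 2019]  # assumed consecutive
df = pd.DataFrame(
    [[2015, 0, 150, 200, 0, 13, 0],
     [2016, 0, 0, 200, 140, 35, 0],
     [2017, 0, 0, 0, 20, 12, 0]],
    columns=["yearStart"] + years,
)

values = df[years].to_numpy()
n_cols = values.shape[1]

# Column position of each row's start year.
start = df["yearStart"].to_numpy() - years[0]

# For every output slot k, read the source column start + k.
src = start[:, None] + np.arange(n_cols)
valid = src < n_cols
src = np.where(valid, src, 0)  # placeholder index, masked out just below

shifted = np.where(valid, np.take_along_axis(values, src, axis=1), 0)

cols = ["Year+{}".format(k) for k in range(n_cols)]
df1 = df[["yearStart"]].join(pd.DataFrame(shifted, columns=cols, index=df.index))
print(df1)
```

Unlike the cumsum mask, this keeps a zero that falls in the very first year of a contract, at the cost of relying on yearStart being correct for every row.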

Source

2017-09-18 16:05:30 jezrael

When you mentioned 'Divakar's function justify_rows', you won without a doubt :) – Wen
