2016-07-11 59 views

Pandas: create a column that copies another cell's value

I have a DataFrame of about 400,000 rows containing the following receipt journal: Snippet1

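Rows that have a value in the "Tender" column (e.g. Cash, CrCd) are receipt totals. The rows that follow each total are the items in that transaction. I would like to efficiently tag each item with the receipt number of its total row in a new column, giving the following: Snippet2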

I was able to achieve this in Excel by setting cell O2 to =IF(N2="",O1,C2) and dragging it down. Ideally, I would like to avoid using Excel to manipulate the data.
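For reference, the fill-down rule in that formula can be written as a plain-Python sketch. The function name `fill_down_receipts` is hypothetical, and the sketch assumes column N holds the tender and column C holds the receipt number, as in the sample data. It is only meant to pin down the expected result, not to be fast.

```python
def fill_down_receipts(rcpt_numbers, tenders):
    """Mimic =IF(N2="",O1,C2): keep the row's own number when a tender is
    present, otherwise repeat the value from the row above."""
    filled = []
    previous = None  # stands in for the cell above; None before any total row
    for number, tender in zip(rcpt_numbers, tenders):
        if tender:  # a non-empty tender marks a receipt-total row
            previous = number
        filled.append(previous)
    return filled


# (Rcpt#, Tender) pairs for two receipts, taken from the sample data
numbers = [32381, 18924, 505101, 32382, 502201, 502201]
tenders = ["Cash", "", "", "CrCd", "", ""]
print(fill_down_receipts(numbers, tenders))
```

Any vectorized solution should reproduce this list for the same input.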

Is there a way to do this in pandas without using iterrows() or itertuples()? Both were far too slow on this many rows.

UPDATE: Here is the DataFrame as comma-separated text for testing:

Company Name,Str,Rcpt#,Rcpt Date,Time,Ext O P$,Disc %,Ext D$,Ext P$,Rcpt T$,Shipping w/T,Fee $ w/T,Rcpt Total,Tender
,2,32381,4/5/2015,5:51p,1.96,0,0,1.96,0.04,0,0,2,Cash
,2683,18924,VC,,Item_Desc,1,1.5,0,0.25,,,,
,2713,505101,VC1,C12A,Item_desc,1,0.46,0,0.12,,,,
,,32382,4/5/2015,6:01p,18.3,0,0,18.3,1.7,0,0,20,CrCd
,3034,502201,AC,,Item_desc,1,9.15,0,3.36,,,,
,3034,502201,AC5,,Item_desc,1,9.15,0,3.36,,,,
,,32383,4/5/2015,6:08p,9.15,0,0,9.15,0.85,0,0,10,Cash
,3034,502201,AC5,,Item_Desc,1,9.15,0,3.36,,,,
,,32384,4/5/2015,6:13p,18.3,0,0,18.3,1.7,0,0,20,CrCd
,2212,505201,GV,J25A,Item_desc,1,9.15,0,1.56,,,,
,2212,505201,GV,J25A,Item_desc,1,9.15,0,1.56,,,,
,,32385,4/5/2015,6:15p,4.5,0,0,4.5,0,0,0,4.5,Cash
,4619,18924,VC,,Item_desc,1,4.5,0,0.5,,,,
,,32386,4/5/2015,6:15p,4.5,0,0,4.5,0,0,0,4.5,Cash
,4619,18924,VC,,Item_desc,1,4.5,0,0.5,,,,
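To experiment with this sample without saving a file, the text can be fed to `pd.read_csv` through `io.StringIO`. The sketch below uses a shortened copy of the data (the first two receipts only) and assumes there is no trailing whitespace in the header or rows, so that empty `Tender` cells are read as `NaN`.

```python
import io

import pandas as pd

# Shortened copy of the sample: two receipt-total rows and their item rows.
csv_text = """Company Name,Str,Rcpt#,Rcpt Date,Time,Ext O P$,Disc %,Ext D$,Ext P$,Rcpt T$,Shipping w/T,Fee $ w/T,Rcpt Total,Tender
,2,32381,4/5/2015,5:51p,1.96,0,0,1.96,0.04,0,0,2,Cash
,2683,18924,VC,,Item_Desc,1,1.5,0,0.25,,,,
,2713,505101,VC1,C12A,Item_desc,1,0.46,0,0.12,,,,
,,32382,4/5/2015,6:01p,18.3,0,0,18.3,1.7,0,0,20,CrCd
,3034,502201,AC,,Item_desc,1,9.15,0,3.36,,,,
,3034,502201,AC5,,Item_desc,1,9.15,0,3.36,,,,
"""

# Parse the text as if it were a file on disk.
df = pd.read_csv(io.StringIO(csv_text))

# Only the receipt-total rows carry a Tender value; item rows show NaN.
print(df[["Rcpt#", "Tender"]])
```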

Please share your code.

Answer

UPDATE:

In [11]: df['ReceiptNumber'] = (df.assign(ReceiptNumber=np.where(pd.notnull(df.Tender),
   ....:                                                          df['Rcpt#'],
   ....:                                                          np.nan))['ReceiptNumber']
   ....:                          .ffill()
   ....:                          .astype(int))

In [12]: df[['Rcpt#','Tender','ReceiptNumber']]
Out[12]:
     Rcpt#  Tender  ReceiptNumber
0    32381    Cash          32381
1    18924     NaN          32381
2   505101     NaN          32381
3    32382    CrCd          32382
4   502201     NaN          32382
5   502201     NaN          32382
6    32383    Cash          32383
7   502201     NaN          32383
8    32384    CrCd          32384
9   505201     NaN          32384
10  505201     NaN          32384
11   32385    Cash          32385
12   18924     NaN          32385
13   32386    Cash          32386
14   18924     NaN          32386
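The same idea can be written as a self-contained script. This is a sketch of a variant, not the code from the session above: it swaps `np.where` plus `assign` for `Series.where`, and it uses a small stand-in frame with only the two relevant columns.

```python
import numpy as np
import pandas as pd

# Minimal stand-in for the two relevant columns of the sample data.
df = pd.DataFrame({
    "Rcpt#": [32381, 18924, 505101, 32382, 502201, 502201],
    "Tender": ["Cash", np.nan, np.nan, "CrCd", np.nan, np.nan],
})

# Keep Rcpt# only on receipt-total rows (Tender present) and blank it elsewhere,
# then carry the last seen receipt number forward onto the item rows.
is_total = df["Tender"].notna()
df["ReceiptNumber"] = df["Rcpt#"].where(is_total).ffill().astype(int)

print(df)
```

Note that `astype(int)` raises if the first row is not a receipt-total row, because the forward fill then leaves a leading `NaN`. In that case, drop the cast or handle those rows first.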

OLD answer:

(df.assign(ReceiptNumber=np.where(pd.notnull(df.Tender),
                                  df['Rcpt#'],
                                  np.nan))['ReceiptNumber']
   .ffill())

PS: this snippet was not tested. You did not provide your data set in text form, so I could not copy and paste it.

Apologies @MaxU, a comma-separated version of the data is there now. I could not get this to work as is. I just get all 'NaN' in the new column.

@B-road95, please check the UPDATE in my answer – MaxU

Works great. Thanks a lot @MaxU