2016-10-12 72 views
import pandas as pd 

mydata = [{'ID' : '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016'}, 
      {'ID' : '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016'}] 

mydata2 = [{'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '10/10/2016'}, 
      {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '11/10/2016'}, 
      {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '12/10/2016'}, 
      {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '13/10/2016'}, 
      {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '14/10/2016'}, 
      {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '15/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '10/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '11/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '12/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '13/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '14/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '15/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '16/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '17/10/2016'}, 
      {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016', 'Date': '18/10/2016'},] 

df = pd.DataFrame(mydata) 
df2 = pd.DataFrame(mydata2) 

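I couldn't find an answer on how to turn 'df' into 'df2'. Maybe I'm not phrasing it right. How do I add rows for all the dates between two columns?

I want to take every date between the 'Entry Date' and 'Exit Date' columns and create one row for each, with the corresponding date entered in a new 'Date' column.

Any help would be greatly appreciated.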

Answer

You can reshape with melt, then set_index on the dates and drop the variable column:

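# convert columns to datetime (the sample dates are day-first, so state the format explicitly)
df['Entry Date'] = pd.to_datetime(df['Entry Date'], format='%d/%m/%Y')
df['Exit Date'] = pd.to_datetime(df['Exit Date'], format='%d/%m/%Y')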

df2 = pd.melt(df, id_vars='ID', value_name='Date') 
df2.Date = pd.to_datetime(df2.Date) 
df2.set_index('Date', inplace=True) 
df2.drop('variable', axis=1, inplace=True) 
print (df2) 
      ID 
Date   
2016-10-10 10 
2016-10-10 20 
2016-10-15 10 
2016-10-18 20 

Then groupby on ID, resample to daily frequency, and ffill the missing values:

df3 = df2.groupby('ID').resample('D').ffill().reset_index(level=0, drop=True).reset_index() 
print (df3) 
     Date ID 
0 2016-10-10 10 
1 2016-10-11 10 
2 2016-10-12 10 
3 2016-10-13 10 
4 2016-10-14 10 
5 2016-10-15 10 
6 2016-10-10 20 
7 2016-10-11 20 
8 2016-10-12 20 
9 2016-10-13 20 
10 2016-10-14 20 
11 2016-10-15 20 
12 2016-10-16 20 
13 2016-10-17 20 
14 2016-10-18 20 
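
To see what the resample step does for a single ID, here is a minimal sketch on a standalone Series. The names s and filled are only for illustration and are not part of the answer above; the two index entries stand in for one ID's entry and exit dates.

```python
import pandas as pd

# One ID with only its entry and exit dates as the index (illustrative data)
s = pd.Series(['10', '10'],
              index=pd.to_datetime(['2016-10-10', '2016-10-15']),
              name='ID')

# Upsample to daily frequency; the new in-between days start out empty
# and ffill copies the last known value forward into them
filled = s.resample('D').ffill()
print(filled)
```

The groupby in the answer applies this same upsampling separately to each ID, so every ID gets its own range from entry date to exit date.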

Finally, merge with the original DataFrame:

print (pd.merge(df, df3)) 
    Entry Date Exit Date ID  Date 
0 2016-10-10 2016-10-15 10 2016-10-10 
1 2016-10-10 2016-10-15 10 2016-10-11 
2 2016-10-10 2016-10-15 10 2016-10-12 
3 2016-10-10 2016-10-15 10 2016-10-13 
4 2016-10-10 2016-10-15 10 2016-10-14 
5 2016-10-10 2016-10-15 10 2016-10-15 
6 2016-10-10 2016-10-18 20 2016-10-10 
7 2016-10-10 2016-10-18 20 2016-10-11 
8 2016-10-10 2016-10-18 20 2016-10-12 
9 2016-10-10 2016-10-18 20 2016-10-13 
10 2016-10-10 2016-10-18 20 2016-10-14 
11 2016-10-10 2016-10-18 20 2016-10-15 
12 2016-10-10 2016-10-18 20 2016-10-16 
13 2016-10-10 2016-10-18 20 2016-10-17 
14 2016-10-10 2016-10-18 20 2016-10-18 
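
If resample is not available or behaves differently in your pandas version, a possible alternative (not the method used in this answer) is to build the daily range per row with pd.date_range and then explode it into rows. This is only a sketch: it assumes a pandas version that has DataFrame.explode (0.25 or newer), and the name expanded is hypothetical.

```python
import pandas as pd

df = pd.DataFrame([
    {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016'},
    {'ID': '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016'},
])

# The sample dates are day-first, so state the format explicitly
for col in ['Entry Date', 'Exit Date']:
    df[col] = pd.to_datetime(df[col], format='%d/%m/%Y')

# One list of daily timestamps per row, from entry date to exit date inclusive
df['Date'] = pd.Series(
    [list(pd.date_range(start, end, freq='D'))
     for start, end in zip(df['Entry Date'], df['Exit Date'])],
    index=df.index,
)

# explode turns each list element into its own row (assumes pandas >= 0.25)
expanded = df.explode('Date').reset_index(drop=True)
expanded['Date'] = pd.to_datetime(expanded['Date'])
print(expanded)
```

This keeps 'Entry Date' and 'Exit Date' on every row, so no merge is needed afterwards.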

df2.groupby('ID').resample('D').ffill().reset_index(level=0, drop=True).reset_index() unfortunately does not return any values.

Yes, maybe you need pandas version 0.18.1 or newer. - [see](http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#groupby-syntax-with-window-and-resample-operations) – jezrael
