2017-03-03 70 views
4

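How to append to a multi-level pandas DataFrame using iteration?

I am trying to download and organize data from Yahoo Finance using pandas-datareader. The process is simple.

For each stock I do the following: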

aapl = data.DataReader('AAPL', 'yahoo', '2004-01-01') 
del aapl['Close'] 
aapl.rename(columns={'Adj Close': 'Close'}, inplace=True) 

gs = data.DataReader('GS', 'yahoo', '2004-01-01') 
del gs['Close'] 
gs.rename(columns={'Adj Close': 'Close'}, inplace=True) 

Then I add an outer column level holding the ticker name:

aapl.columns = pd.MultiIndex.from_product([['aapl'], aapl.columns]) 
gs.columns = pd.MultiIndex.from_product([['gs'], gs.columns]) 

Finally, I put them together:

df = pd.concat([aapl, gs], axis=1) 

How can I do this efficiently with a for loop over a list of 100+ tickers?

The structure would be something like this:

stocks = ['AAPL', 'GS'] 

for i in stocks: 
    i = data.DataReader(i, 'yahoo', '2004-01-01') 
    del i['Close'] 
    i.rename(columns={'Adj Close': 'Close'}, inplace=True) 
    i.columns = pd.MultiIndex.from_product([['i'], i.columns]) 
    # append to df 

The desired output for the dummy example would be:

df.head() 

      aapl gs 
      Close Close 
Date   
2004-01-02 1.38 83.58 
2004-01-05 1.44 83.63 
2004-01-06 1.43 83.13 
2004-01-07 1.46 84.87 
2004-01-08 1.51 84.98 

Answers

3

I would use a pandas Panel:

In [69]: from pandas_datareader import data 

In [70]: stocks = ['AAPL', 'GS'] 

Read all the stocks into a pandas Panel in one step:

In [71]: p = data.DataReader(stocks, 'yahoo', '2004-01-01') 

In [72]: p.axes 
Out[72]: 
[Index(['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'], dtype='object'), 
 DatetimeIndex(['2004-01-02', '2004-01-05', '2004-01-06', '2004-01-07', '2004-01-08', '2004-01-09', '2004-01-12', '2004-01-13', '2004-01-14', '2004-01-15', 
       ... 
       '2017-02-16', '2017-02-17', '2017-02-21', '2017-02-22', '2017-02-23', '2017-02-24', '2017-02-27', '2017-02-28', '2017-03-01', '2017-03-02'], 
       dtype='datetime64[ns]', name='Date', length=3314, freq=None), 
Index(['AAPL', 'GS'], dtype='object')] 

Now you can slice this Panel like this:

In [73]: p.loc['Adj Close'] 
Out[73]: 
        AAPL   GS 
Date 
2004-01-02 1.378514 83.582711 
2004-01-05 1.436168 83.625740 
2004-01-06 1.430985 83.126634 
2004-01-07 1.463375 84.873497 
2004-01-08 1.513256 84.976763 
2004-01-09 1.489935 83.901107 
2004-01-12 1.537224 84.142053 
2004-01-13 1.562488 84.047395 
2004-01-14 1.567671 85.518890 
...    ...   ... 
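
Panel was deprecated and later removed from pandas, so on a recent installation the same kind of slicing has to go through a DataFrame with MultiIndex columns. The following is only a sketch under that assumption: the `prices` frame is filled with made-up numbers rather than downloaded data, and the level names `field` and `ticker` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for downloaded prices (the values are made up).
dates = pd.date_range("2004-01-02", periods=3, freq="B", name="Date")
columns = pd.MultiIndex.from_product(
    [["Close", "Adj Close"], ["AAPL", "GS"]], names=["field", "ticker"]
)
prices = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4), index=dates, columns=columns
)

# Rough equivalent of p.loc['Adj Close']: one column per ticker.
adj_close = prices["Adj Close"]

# All fields for a single ticker, selected on the inner column level.
aapl = prices.xs("AAPL", axis=1, level="ticker")

print(adj_close)
print(aapl)
```

Selecting on the outer level with `[]` and on the inner level with `xs` covers the two slicing directions the Panel offered here.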

UPDATE: converting the Panel to a multi-level DataFrame:

DataFrame with a MultiIndex on the rows:

In [80]: p.to_frame() 
Out[80]: 
         Open  High   Low  Close  Volume Adj Close 
Date  minor 
2004-01-02 AAPL 21.549999 21.750000 21.180001 21.280000 36160600.0 1.378514 
      GS  98.800003 99.089996 96.580002 97.129997 3042300.0 83.582711 
2004-01-05 AAPL 21.420000 22.390000 21.420000 22.170000 98754600.0 1.436168 
      GS  97.300003 97.940002 96.150002 97.180000 4817700.0 83.625740 
2004-01-06 AAPL 22.250000 22.420001 21.710000 22.090000 127337000.0 1.430985 
      GS  97.360001 97.669998 96.379997 96.599998 4077800.0 83.126634 
2004-01-07 AAPL 22.100000 22.830000 21.930000 22.590000 146718600.0 1.463375 
      GS  96.760002 98.860001 96.449997 98.629997 4457800.0 84.873497 
2004-01-08 AAPL 22.840000 23.730001 22.649999 23.360001 115075800.0 1.513256 
      GS  98.730003 98.980003 97.699997 98.750000 3687800.0 84.976763 
...      ...   ...   ...   ...   ...   ... 
2017-02-24 AAPL 135.910004 136.660004 135.279999 136.660004 21690900.0 136.660004 
      GS  247.699997 248.880005 246.100006 247.350006 3565400.0 246.705168 
2017-02-27 AAPL 137.139999 137.440002 136.279999 136.929993 20196400.0 136.929993 
      GS  247.210007 249.759995 246.610001 249.330002 2372600.0 248.680002 
2017-02-28 AAPL 137.080002 137.440002 136.699997 136.990005 23403500.0 136.990005 
      GS  248.000000 249.000000 245.610001 248.059998 3627100.0 248.059998 
2017-03-01 AAPL 137.889999 140.149994 137.600006 139.789993 36272400.0 139.789993 
      GS  253.710007 255.149994 251.259995 252.710007 5218300.0 252.710007 
2017-03-02 AAPL 140.000000 140.279999 138.759995 138.960007 26153300.0 138.960007 
      GS  253.520004 254.240005 250.970001 251.059998 3014300.0 251.059998 

[6628 rows x 6 columns] 

DataFrame with multi-level columns:

In [81]: p.to_frame().unstack() 
Out[81]: 
        Open     High      Low     Close    \ 
minor    AAPL   GS  AAPL   GS  AAPL   GS  AAPL   GS 
Date 
2004-01-02 21.549999 98.800003 21.750000 99.089996 21.180001 96.580002 21.280000 97.129997 
2004-01-05 21.420000 97.300003 22.390000 97.940002 21.420000 96.150002 22.170000 97.180000 
2004-01-06 22.250000 97.360001 22.420001 97.669998 21.710000 96.379997 22.090000 96.599998 
2004-01-07 22.100000 96.760002 22.830000 98.860001 21.930000 96.449997 22.590000 98.629997 
2004-01-08 22.840000 98.730003 23.730001 98.980003 22.649999 97.699997 23.360001 98.750000 
2004-01-09 23.229999 98.739998 24.130000 98.750000 22.789999 97.290001 23.000001 97.500000 
2004-01-12 23.250000 97.599998 24.000000 97.849998 23.100000 96.449997 23.730001 97.779999 
2004-01-13 24.700000 97.849998 24.839999 97.949997 23.860000 97.040001 24.120000 97.669998 
2004-01-14 24.399999 97.500000 24.539999 99.500000 23.780000 97.459999 24.200000 99.379997 
2004-01-15 22.910000 100.400002 23.400000 102.000000 22.499999 99.949997 22.850001 101.139999 
...    ...   ...   ...   ...   ...   ...   ...   ... 
2017-02-16 135.669998 250.300003 135.899994 250.779999 134.839996 248.440002 135.350006 249.440002 
2017-02-17 135.100006 247.509995 135.830002 250.559998 135.100006 247.110001 135.720001 250.380005 
2017-02-21 136.229996 251.000000 136.750000 252.649994 135.979996 250.710007 136.699997 251.759995 
2017-02-22 136.429993 250.059998 137.119995 252.350006 136.110001 250.000000 137.110001 251.729996 
2017-02-23 137.380005 251.309998 137.479996 251.899994 136.300003 249.320007 136.529999 251.190002 
2017-02-24 135.910004 247.699997 136.660004 248.880005 135.279999 246.100006 136.660004 247.350006 
2017-02-27 137.139999 247.210007 137.440002 249.759995 136.279999 246.610001 136.929993 249.330002 
2017-02-28 137.080002 248.000000 137.440002 249.000000 136.699997 245.610001 136.990005 248.059998 
2017-03-01 137.889999 253.710007 140.149994 255.149994 137.600006 251.259995 139.789993 252.710007 
2017-03-02 140.000000 253.520004 140.279999 254.240005 138.759995 250.970001 138.960007 251.059998 
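
The `unstack()` step can be tried without a Panel or a network connection. This is a minimal sketch that assumes a long-format frame shaped like the `p.to_frame()` output above; the few values are copied (rounded) from that output, and the names `long_df` and `wide_df` are illustrative.

```python
import pandas as pd

# Long format: one row per (Date, ticker), like the output of p.to_frame().
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2004-01-02", "2004-01-05"]), ["AAPL", "GS"]],
    names=["Date", "minor"],
)
long_df = pd.DataFrame(
    {
        "Open": [21.55, 98.80, 21.42, 97.30],
        "Close": [21.28, 97.13, 22.17, 97.18],
    },
    index=index,
)

# unstack() moves the innermost row level ("minor") into the columns,
# giving one (field, ticker) column pair per original column.
wide_df = long_df.unstack()
print(wide_df)
```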

You can also swap the column levels and sort them, if you want:

In [96]: p.to_frame().unstack().swaplevel(axis=1).sort_index(1) 
Out[96]: 
minor    AAPL                  GS    \ 
      Adj Close  Close  High   Low  Open  Volume Adj Close  Close 
Date 
2004-01-02 1.378514 21.280000 21.750000 21.180001 21.549999 36160600.0 83.582711 97.129997 
2004-01-05 1.436168 22.170000 22.390000 21.420000 21.420000 98754600.0 83.625740 97.180000 
2004-01-06 1.430985 22.090000 22.420001 21.710000 22.250000 127337000.0 83.126634 96.599998 
2004-01-07 1.463375 22.590000 22.830000 21.930000 22.100000 146718600.0 84.873497 98.629997 
2004-01-08 1.513256 23.360001 23.730001 22.649999 22.840000 115075800.0 84.976763 98.750000 
2004-01-09 1.489935 23.000001 24.130000 22.789999 23.229999 106864800.0 83.901107 97.500000 
2004-01-12 1.537224 23.730001 24.000000 23.100000 23.250000 121886800.0 84.142053 97.779999 
2004-01-13 1.562488 24.120000 24.839999 23.860000 24.700000 169754200.0 84.047395 97.669998 
2004-01-14 1.567671 24.200000 24.539999 23.780000 24.399999 155010800.0 85.518890 99.379997 
2004-01-15 1.480218 22.850001 23.400000 22.499999 22.910000 254552200.0 87.033415 101.139999 
...    ...   ...   ...   ...   ...   ...   ...   ... 
2017-02-16 135.350006 135.350006 135.899994 134.839996 135.669998 22118000.0 248.789715 249.440002 
2017-02-17 135.720001 135.720001 135.830002 135.100006 135.100006 22084500.0 249.727267 250.380005 
2017-02-21 136.699997 136.699997 136.750000 135.979996 136.229996 24265100.0 251.103659 251.759995 
2017-02-22 137.110001 137.110001 137.119995 136.110001 136.429993 20745300.0 251.073739 251.729996 
2017-02-23 136.529999 136.529999 137.479996 136.300003 137.380005 20704100.0 250.535153 251.190002 
2017-02-24 136.660004 136.660004 136.660004 135.279999 135.910004 21690900.0 246.705168 247.350006 
2017-02-27 136.929993 136.929993 137.440002 136.279999 137.139999 20196400.0 248.680002 249.330002 
2017-02-28 136.990005 136.990005 137.440002 136.699997 137.080002 23403500.0 248.059998 248.059998 
2017-03-01 139.789993 139.789993 140.149994 137.600006 137.889999 36272400.0 252.710007 252.710007 
2017-03-02 138.960007 138.960007 140.279999 138.759995 140.000000 26153300.0 251.059998 251.059998 
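
The level swap and sort can be checked on a one-row frame. The sketch below passes `axis=1` by keyword, on the assumption that newer pandas releases no longer accept it positionally as in `sort_index(1)`; the frame and its values are illustrative only.

```python
import pandas as pd

# Columns start as (field, ticker), deliberately unsorted on the ticker level.
columns = pd.MultiIndex.from_product([["Open", "Close"], ["GS", "AAPL"]])
wide_df = pd.DataFrame(
    [[98.80, 21.55, 97.13, 21.28]],
    index=pd.to_datetime(["2004-01-02"]),
    columns=columns,
)

# Swap to (ticker, field), then sort the columns so each ticker's fields
# sit next to each other.
by_ticker = wide_df.swaplevel(axis=1).sort_index(axis=1)
print(by_ticker)
```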
+0

Panels are your favorite ;) – jezrael

+0

@jezrael, yes, I love them ;-) – MaxU

1

This downloads the pricing data for each stock in turn and uses pd.concat() to combine the data into a single pandas.DataFrame.

Code:

import pandas as pd 
from pandas_datareader import data 

stocks = ['AAPL', 'GS'] 
df = None  # named df so it does not shadow the pandas_datareader `data` module 
for stock_name in stocks: 
    # fetch the price data 
    stock_data = data.DataReader(stock_name, 'yahoo', '2004-01-01') 

    # remove the closing price 
    del stock_data['Close'] 

    # rename Adjusted Close to Close 
    stock_data.rename(columns={'Adj Close': 'Close'}, inplace=True) 

    # Add a multi index for the stock name 
    stock_data.columns = pd.MultiIndex.from_product(
        [[stock_name], stock_data.columns]) 

    # concat this stock to the previous stocks 
    if df is None: 
        df = stock_data 
    else: 
        df = pd.concat([df, stock_data], axis=1) 

Results:

    AAPL             \ 
        Open  High   Low  Volume  Close 
Date                  
2004-01-02 21.549999 21.750000 21.180001 36160600 1.378514 
2004-01-05 21.420000 22.390000 21.420000 98754600 1.436168 
2004-01-06 22.250000 22.420001 21.710000 127337000 1.430985 
...    ...   ...   ...  ...   ... 
2017-02-28 137.080002 137.440002 136.699997 23403500 136.990005 
2017-03-01 137.889999 140.149994 137.600006 36272400 139.789993 
2017-03-02 140.000000 140.279999 138.759995 26153300 138.960007 

        GS            
        Open  High   Low Volume  Close 
Date                 
2004-01-02 98.800003 99.089996 96.580002 3042300 83.582711 
2004-01-05 97.300003 97.940002 96.150002 4817700 83.625740 
2004-01-06 97.360001 97.669998 96.379997 4077800 83.126634 
...    ...   ...   ...  ...   ... 
2017-02-28 248.000000 249.000000 245.610001 3627100 248.059998 
2017-03-01 253.710007 255.149994 251.259995 5218300 252.710007 
2017-03-02 253.520004 254.240005 250.970001 3014300 251.059998
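
For a list of 100+ tickers, one possible variation is to collect the per-ticker frames in a dict and call `pd.concat` once after the loop, letting the dict keys become the outer column level. This is a sketch, not the answer's method: `fetch_prices` is a hypothetical stand-in for `data.DataReader(ticker, 'yahoo', '2004-01-01')` that returns random numbers so the example runs offline.

```python
import numpy as np
import pandas as pd


def fetch_prices(ticker):
    """Hypothetical stand-in for the real download; returns dummy prices."""
    dates = pd.date_range("2004-01-02", periods=3, freq="B", name="Date")
    seed = sum(map(ord, ticker))  # deterministic dummy values per ticker
    values = np.random.default_rng(seed).random((3, 2))
    return pd.DataFrame(values, index=dates, columns=["Close", "Adj Close"])


stocks = ["AAPL", "GS"]
frames = {}
for ticker in stocks:
    frame = fetch_prices(ticker)
    # Drop the raw close and keep the adjusted close under the name "Close".
    frame = frame.drop(columns="Close").rename(columns={"Adj Close": "Close"})
    frames[ticker.lower()] = frame

# The dict keys become the outer column level; concatenate once at the end.
df = pd.concat(frames, axis=1)
print(df.head())
```

Concatenating once avoids re-copying the accumulated frame on every iteration, which is the cost of calling `pd.concat` inside the loop.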