Extract data and create a new table

I am trying to build a 2x24 table in pandas from the following data:

d.iloc[0:2] = [[0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L], [0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 2L, 2L, 0L, 0L, 0L]]

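Basically, the first sub-bracket holds 24 hours of data for one day in January, and the second sub-bracket holds the same for February. I would like to build a 2x24 table (without the "L") laid out as follows: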

1 2 3 4 5 6 7 8 9 10 11 12 ... 24 
Jan 0 0 0 0 0 0 0 0 0 1 1 1 ... 0 
Feb 0 0 0 0 0 0 0 0 0 1 1 1 ... 0

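I am finding it challenging to strip (.strip) the "L", split the values, and copy the data into a new DataFrame structure. In data frames I find online, I often see this original structure with 12 sub-brackets (one per month). I included d.iloc[0,2] because I intend to use a for loop to apply the function to all elements in column 2. Thank you for your valuable help.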

Source

2016-09-18 John12

I think you can use DataFrame.from_records with apply and str.strip:

import pandas as pd 
import numpy as np 

a = [['0L', '0L', '0L', '0L', '0L', '0L', '0L', '0L', '0L', '1L', '1L', '1L', '1L', '1L', '0L', '0L', '0L', '1L', '1L', '1L', '1L', '0L', '0L', '0L'], 
    ['0L', '0L', '0L', '0L', '0L', '0L', '0L', '0L', '0L', '1L', '1L', '1L', '1L', '1L', '0L', '0L', '0L', '1L', '1L', '2L', '2L', '0L', '0L', '0L']] 

idx = ['Jan','Feb'] 
df = pd.DataFrame.from_records(a, index=idx).apply(lambda x: x.str.strip('L').astype(int)) 
print (df) 
    0 1 2 3 4 5 6 7 8 9 ... 14 15 16 17 18 19 20 \ 
Jan 0 0 0 0 0 0 0 0 0 1 ... 0 0 0 1 1 1 1 
Feb 0 0 0 0 0 0 0 0 0 1 ... 0 0 0 1 1 2 2 

    21 22 23 
Jan 0 0 0 
Feb 0 0 0 

[2 rows x 24 columns]
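
The question asked for columns labelled 1 to 24, whereas the output above uses the default labels 0 to 23. A minimal sketch of one way to relabel them, assuming the same nested list of strings as in the sample (the names `raw` and `table` are illustrative only):

```python
import pandas as pd

# Same values as the sample data: two rows of 24 strings with a trailing 'L'.
raw = [['0L'] * 9 + ['1L'] * 5 + ['0L'] * 3 + ['1L'] * 4 + ['0L'] * 3,
       ['0L'] * 9 + ['1L'] * 5 + ['0L'] * 3 + ['1L', '1L', '2L', '2L'] + ['0L'] * 3]

# Strip the 'L' column by column and convert to integers.
table = pd.DataFrame(raw, index=['Jan', 'Feb']).apply(
    lambda col: col.str.strip('L').astype(int))

# Relabel the columns as hours 1..24 instead of the default 0..23.
table.columns = range(1, table.shape[1] + 1)
print(table)
```

After the relabelling, `table.loc['Feb', 20]` refers to the 20th hour of the February row.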

A more general solution, with the index names generated by dt.strftime:

print (pd.Series(range(1,len(a) + 1))) 
0 1 
1 2 
dtype: int32 

idx = pd.to_datetime(pd.Series(range(1,len(a) + 1)), format='%m').dt.strftime('%b') 
print (idx) 
0 Jan 
1 Feb 
dtype: object 

df = pd.DataFrame.from_records(a, index=idx).apply(lambda x: x.str.strip('L').astype(int)) 
print (df) 
    0 1 2 3 4 5 6 7 8 9 ... 14 15 16 17 18 19 20 \ 
Jan 0 0 0 0 0 0 0 0 0 1 ... 0 0 0 1 1 1 1 
Feb 0 0 0 0 0 0 0 0 0 1 ... 0 0 0 1 1 2 2 

    21 22 23 
Jan 0 0 0 
Feb 0 0 0
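
As an alternative sketch, the month labels can also be produced without a datetime round trip, using the standard library's `calendar.month_abbr`. This is a swapped-in technique, not the approach used in the answer above; `rows`, `month_labels`, and `frame` are illustrative names, and the abbreviations follow the active locale just as `strftime('%b')` does.

```python
import calendar

import pandas as pd

# Three illustrative rows, one per month (hypothetical data).
rows = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# calendar.month_abbr[0] is an empty string, so month numbers start at 1.
month_labels = [calendar.month_abbr[m] for m in range(1, len(rows) + 1)]

frame = pd.DataFrame(rows, index=month_labels)
print(frame)
```

Because the labels are built from `len(rows)`, the same line covers anything from one row up to all 12 months.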

If you need to split the values first:

b = [['0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L'], 
    ['0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 2L, 2L, 0L, 0L, 0L']] 

idx = pd.to_datetime(pd.Series(range(1,len(b) + 1)), format='%m').dt.strftime('%b') 

# the parentheses let the method chain span several lines 
df1 = (pd.DataFrame.from_records(b, index=idx) 
         .iloc[:,0] 
         .str.split(', ', expand=True) 
         .replace({'L':''}, regex=True) 
         .astype(int)) 
print (df1) 

    0 1 2 3 4 5 6 7 8 9 ... 14 15 16 17 18 19 20 \ 
Jan 0 0 0 0 0 0 0 0 0 1 ... 0 0 0 1 1 1 1 
Feb 0 0 0 0 0 0 0 0 0 1 ... 0 0 0 1 1 2 2 

    21 22 23 
Jan 0 0 0 
Feb 0 0 0 

[2 rows x 24 columns]
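
The same split-then-strip step can also be sketched in plain Python before the DataFrame is built. This assumes, as in `b` above, that each row is a one-element list holding a comma-separated string; `packed`, `parsed`, and `frame` are illustrative names and the data is shortened.

```python
import pandas as pd

# Shortened stand-in for `b`: one comma-separated string per row.
packed = [['0L, 1L, 2L'], ['3L, 4L, 5L']]

# Split each string on commas, drop surrounding spaces and the trailing 'L',
# then convert every token to int.
parsed = [[int(tok.strip().rstrip('L')) for tok in row[0].split(',')]
          for row in packed]

frame = pd.DataFrame(parsed, index=['Jan', 'Feb'])
print(frame)
```

Parsing first means the DataFrame is created with integer columns directly, so no later `astype(int)` call is needed.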

Source

2016-09-18 12:47:28 jezrael

How is it working? – jezrael

Thank you for the detailed answer. I cannot apply .str.strip('L').astype(int) to a cell of the DataFrame: AttributeError: 'str' object has no attribute 'str'. How come? (The cell is of type str.) – John12

Does it fail with the sample data or with the real data? – jezrael
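
One possible explanation for the AttributeError in the comment above (an assumption, since the thread does not confirm the cause): `.str` is an accessor on a pandas Series, while a single cell selected with something like `iloc[0, 1]` is a plain Python `str`, which has no such attribute. A minimal sketch with made-up data; all names here are illustrative:

```python
import pandas as pd

frame = pd.DataFrame([['0L', '1L'], ['2L', '3L']], index=['Jan', 'Feb'])

cell = frame.iloc[0, 1]      # a single cell: a plain Python str
column = frame.iloc[:, 1]    # a whole column: a pandas Series

# Scalar: call str.strip directly and convert with int().
cell_value = int(cell.strip('L'))

# Series: use the .str accessor, then astype(int).
column_values = column.str.strip('L').astype(int)

# Using .str on the scalar reproduces the reported error.
try:
    cell.str.strip('L')
except AttributeError as exc:
    error_message = str(exc)

print(cell_value, list(column_values), error_message)
```

If this is the cause, applying the accessor to a whole column or row, or calling `strip` on the cell directly, would avoid the error.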
