Plotting an awkward pandas multi-index DataFrame

I have a very awkward DataFrame that looks like this:

+----+------+-------+-------+--------+----+--------+ 
| |  | hour1 | hour2 | hour3 | … | hour24 | 
+----+------+-------+-------+--------+----+--------+ 
| id | date |  |  |  | |  | 
| 1 | 3 |  4 |  0 |  96 | 88 |  35 | 
| | 4 | 10 |  2 |  54 | 42 |  37 | 
| | 5 |  9 | 32 |  8 | 70 |  34 | 
| | 6 | 36 | 89 |  69 | 46 |  78 | 
| 2 | 5 | 17 | 41 |  48 | 45 |  71 | 
| | 6 | 50 | 66 |  82 | 72 |  59 | 
| | 7 | 14 | 24 |  55 | 20 |  89 | 
| | 8 | 76 | 36 |  13 | 14 |  21 | 
| 3 | 5 | 97 | 19 |  41 | 61 |  72 | 
| | 6 | 22 |  4 |  56 | 82 |  15 | 
| | 7 | 17 | 57 |  30 | 63 |  88 | 
| | 8 | 83 | 43 |  35 | 8 |  4 | 
+----+------+-------+-------+--------+----+--------+

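For each id there is a list of dates, and for each date the hour columns hold that whole day's data, broken down by hour across the full 24 hours.

What I want to do is plot (using matplotlib) the full hourly data for each of the ids, but I can't think of a way to do this. I have been looking into creating numpy matrices, but I'm not sure that is the right path.

Clarification: essentially, for each id I want to concatenate all the hourly data together in order and plot it. I already have the days in the proper order, so I imagine it's just a matter of finding a way to put all the hourly data for each id into one object.

Any thoughts on how best to accomplish this?

Here is some sample data in CSV format: http://www.sharecsv.com/s/e56364930ddb3d04dec6994904b05cc6/test1.csv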

2015-06-14 metersk

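How do you want to plot it? Do you mean you want to plot each row of the DataFrame as a separate line and combine all of those lines in one figure? – BrenBarn

@BrenBarn Essentially, for each id, I want to concatenate all the hourly data together in order and plot it. I already have the days in the proper order, so I imagine it's just a matter of finding a way to put all the hourly data for each id into one object. – metersk

Again, please say what you mean by "plot". Plot it *how*? A bar plot? A line plot? What does each bar/line represent? How, if at all, are these bars/lines combined into one figure? Do you mean that, e.g., for id = 1 you would get 96 points (since it has four dates, each with 24 points)? – BrenBarn

Here is one way: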

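import numpy as np
from matplotlib import pyplot

# d is the DataFrame indexed by ('id', 'date'), with one column per hour
for groupID, data in d.groupby(level='id'):
    fig = pyplot.figure()
    ax = fig.gca()
    # flatten this id's rows (one per date) into a single 1-D series
    ax.plot(data.values.ravel())
    # one tick at the start of each day (24 hourly values per row)
    ax.set_xticks(np.arange(len(data))*24)
    ax.set_xticklabels(data.index.get_level_values('date'))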

ravel is a numpy method that strings the multiple rows out into one long 1-D array.
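To make the flattening concrete, here is a minimal sketch on a made-up miniature of the question's frame (one id, two dates, three hourly columns; the values and the names `data` and `flat` are only for illustration). It suggests that ravel walks the array row by row, so all hours of the first date come before the hours of the next date.

```python
import pandas as pd

# Hypothetical miniature of the question's frame: one id, two dates, three hours
idx = pd.MultiIndex.from_tuples([(1, 3), (1, 4)], names=['id', 'date'])
data = pd.DataFrame([[4, 0, 96], [10, 2, 54]], index=idx,
                    columns=['hour1', 'hour2', 'hour3'])

# Row-major flattening: every hour of date 3, followed by every hour of date 4
flat = data.values.ravel()
print(flat.tolist())  # [4, 0, 96, 10, 2, 54]
```

This ordering is what lets the flattened array be plotted directly as one continuous hourly series per id.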

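Be careful about running this interactively on a large dataset, since it creates a separate figure for each id. If you want to save the plots or the like, set a non-interactive matplotlib backend and use savefig to save each figure, then close it before creating the next one.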
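A rough sketch of that save-and-close pattern might look like the following. The small frame `d`, the temporary output directory, and the file names are all assumptions made up for illustration; only the Agg backend, savefig, and close are the pieces the advice refers to.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, selected before any figure is created
from matplotlib import pyplot
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame: 2 ids x 2 dates x 3 hourly columns
idx = pd.MultiIndex.from_product([[1, 2], [5, 6]], names=['id', 'date'])
d = pd.DataFrame(np.arange(12).reshape(4, 3), index=idx,
                 columns=['hour1', 'hour2', 'hour3'])

out_dir = tempfile.mkdtemp()  # assumed output location
saved = []
for group_id, data in d.groupby(level='id'):
    fig, ax = pyplot.subplots()
    ax.plot(data.values.ravel())
    # one tick at the start of each day, labelled with its date
    ax.set_xticks(np.arange(len(data)) * data.shape[1])
    ax.set_xticklabels(list(data.index.get_level_values('date')))
    path = os.path.join(out_dir, f'id_{group_id}.png')
    fig.savefig(path)
    pyplot.close(fig)  # free the figure before creating the next one
    saved.append(path)
```

Closing each figure right after saving keeps only one figure in memory at a time, which matters once there are many ids.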

2015-06-14 19:08:48 BrenBarn

You might be interested in stacking the DataFrame, so that the date and the hour end up in the same index. For example, doing

df = df.stack().unstack(0)

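will put the date and hour in the index and the ids as the column names. Calling df.plot() will then give you a line plot of each time series on the same axes. So you can do this as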

ax = df.stack().unstack(0).plot()

and then format the result either by passing arguments to the plot method or by calling methods on ax.
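As an illustration of what the reshaping produces, here is a small sketch on invented data (two ids with partly overlapping dates and only two hourly columns; the name `wide` is just for this example). It should show one column per id, a (date, hour) index, and NaN wherever an id has no data for a date.

```python
import pandas as pd

# Hypothetical miniature: id 1 has dates 3 and 4, id 2 has dates 4 and 5
idx = pd.MultiIndex.from_tuples([(1, 3), (1, 4), (2, 4), (2, 5)],
                                names=['id', 'date'])
df = pd.DataFrame([[4, 0], [10, 2], [17, 41], [50, 66]], index=idx,
                  columns=['hour1', 'hour2'])

# stack() moves the hour columns into the index; unstack(0) moves id out to the columns
wide = df.stack().unstack(0)
print(wide)
```

Because the ids do not all cover the same dates, the missing (date, hour) slots are NaN, and a line plot of `wide` simply leaves those stretches empty for that id.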

2015-06-14 19:52:21 JoeCondron

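That's awesome, thank you. – metersk

You're welcome. I think it solves the 'awkward DataFrame' problem – JoeCondron

I'm not really happy with this solution, but maybe it can serve as a starting point. Since your data is cyclic, I chose a polar plot. Unfortunately, the resolution in the y direction is poor, so I zoomed into the plot manually: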

import pandas as pd
import numpy as np
from itertools import cycle
from matplotlib import pyplot as plt

df = pd.read_csv('test1.csv')
df_new = df.set_index(['id','date'])
n = len(df_new.columns)

# convert from hours to rad; endpoint=False keeps the points on the
# hour ticks set below (assuming 24 hourly columns)
angle = np.linspace(0, 2*np.pi, n, endpoint=False)

# color palette to cycle through
n_data = len(df_new.T.columns)
# divided by two since you have 'red' and 'blue';
# integer division because linspace needs an integer count
color = plt.cm.Paired(np.linspace(0, 1, n_data // 2))
c_iter = cycle(color)

fig = plt.figure()
ax = fig.add_subplot(111, polar=True)

# loop through the columns (one per (id, date) pair) and manually select one category
for i in df_new.T.columns:
    if i[0] == 'red':
        ax.plot(angle, df_new.T[i].values, color=next(c_iter), label=i, linewidth=2)

# set the labels
ax.set_xticks(np.linspace(0, 2*np.pi, 24, endpoint=False))
ax.set_xticklabels(range(24))

# make the legend
ax.legend(loc='upper left', bbox_to_anchor=(1.2, 1.1))
plt.show()

Zoom 0: [image]

Zoom 1: [image]

Zoom 2: [image]

2015-06-14 21:13:35 Moritz

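This is quite different from what I was looking for, but it's still really awesome. – metersk

Be careful if you copy and paste the code; I just removed the y log scale – Moritz

I'm curious, what does this data represent? – Moritz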
