2014-09-27 21 views

Plotting more than 2 series on a figure (matplotlib plot_date())

I have a question similar to the one posted here: Multiple data set plotting with matplotlib.pyplot.plot_date

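That approach worked for me, but I want to plot more than 2 series on the same figure. In my case, if I call plot_date() 5 times, the resulting figure shows the points and lines from the last two calls only. The lines from the first three calls are not drawn, yet all 5 series appear in the legend (I give each of the 5 calls its own color and label).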

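Overview: I am using Python. I read a CSV text file containing the data (series label, date, count (y)) into a list of tuples, then load that list into a pandas DataFrame. Next I pivot it to change it to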

df = df.pivot(index='date', columns='series', values='count') 
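As an illustration of the load-and-pivot step, here is a minimal sketch. It assumes a headerless CSV in the (series, date, count) layout described above; the sample rows and the helper name load_wide are made up for the example, and pd.read_csv is swapped in for the manual line splitting so that the counts arrive as numbers rather than strings.

```python
import io

import pandas as pd

# Made-up sample rows in the (series, date, count) layout described above.
SAMPLE_CSV = """\
series1,2014-09-01,5
series2,2014-09-01,7
series1,2014-09-02,6
series2,2014-09-02,9
"""


def load_wide(source):
    """Read headerless long-format rows and pivot to one column per series."""
    long_df = pd.read_csv(
        source,
        header=None,                        # the file has no header row
        names=["series", "date", "count"],  # so supply the column names
        parse_dates=["date"],               # parse the date column on load
    )
    # One row per date, one column per series, counts as the cell values.
    return long_df.pivot(index="date", columns="series", values="count")


wide = load_wide(io.StringIO(SAMPLE_CSV))
print(wide)
```

With a real file, a path such as plot_test.csv could be passed to load_wide in place of the StringIO object.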

Then my code to plot:

fig = plt.figure() 
plt.plot_date(x=df.index, y=df['series1'], fmt='o-', tz=None, xdate=True, 
     ydate=False, label="d1", color='red') 

plt.plot_date(x=df.index, y=df['series2'], fmt='o-', tz=None, xdate=True, 
     ydate=False, label="d2", color='blue') 

plt.plot_date(x=df.index, y=df['series3'], fmt='o-', tz=None, xdate=True, 
     ydate=False, label="d3", color='green') 

plt.plot_date(x=df.index, y=df['series4'], fmt='o-', tz=None, xdate=True, 
     ydate=False, label="d4", color='orange') 

plt.plot_date(x=df.index, y=df['series5'], fmt='o-', tz=None, xdate=True, 
     ydate=False, label="d5", color='black') 

fig.autofmt_xdate()  
plt.legend() 
plt.xlabel("Day") 
plt.ylabel("Count") 
plt.title("example of trying to plot more than 2 on the same figure") 
fname='test.pdf' 
plt.savefig(fname) 

Here is the result: Sample Plot

Below is the complete code, followed by the text input (run as: python test_plot.py plot_test.csv)

import sys 
import pandas as pd 
import matplotlib.pyplot as plt 


def main(argv=sys.argv): 
    if len(argv) != 2: 
        print(argv[0], "CSVinputFile (path if not in current dir)") 
        sys.exit(-2) 

    inFileName = argv[1] 
    print(inFileName) 

    # read each CSV line into a tuple of (series, date, count) 
    with open(inFileName, 'r') as fp: 
        data_list = [tuple(line.strip().split(",")) for line in fp] 

    header_row = ['series', 'date', 'count'] 
    df = pd.DataFrame.from_records(data_list, columns=header_row) 
    df['date'] = pd.to_datetime(df['date']) 
    # the counts are read as strings; convert them so they plot as numbers 
    df['count'] = pd.to_numeric(df['count']) 

    print(df.head(10)) 
    df = df.pivot(index='date', columns='series', values='count') 

    print(df.head(10)) 
    print(df.describe()) 

    # extract the columns out of the data to plot out 
    series_2_extract = ['series1', 'series3', 'series2'] 
    # df[[series_2_extract]] doesn't work (TypeError: unhashable type: 'list'); 
    # pass the list itself instead 
    d_data = df[series_2_extract] 
    print(d_data) 

    # below works, can use a loop to iterate the list and call plot_date for each item in the list, 
    # but only last two series are showing on the plot 

    fig = plt.figure() 
    plt.plot_date(x=df.index, y=df['series1'], fmt='o-', tz=None, xdate=True, 
         ydate=False, label="d1", color='red') 

    plt.plot_date(x=df.index, y=df['series2'], fmt='o-', tz=None, xdate=True, 
         ydate=False, label="d2", color='blue') 

    plt.plot_date(x=df.index, y=df['series3'], fmt='o-', tz=None, xdate=True, 
         ydate=False, label="d3", color='green') 

    plt.plot_date(x=df.index, y=df['series4'], fmt='o-', tz=None, xdate=True, 
         ydate=False, label="d4", color='orange') 

    plt.plot_date(x=df.index, y=df['series5'], fmt='o-', tz=None, xdate=True, 
         ydate=False, label="d5", color='black') 

    fig.autofmt_xdate() 
    plt.legend() 
    plt.xlabel("Day") 
    plt.ylabel("Count") 
    plt.title("example of trying to plot more than 2 on the same figure") 
    fname = 'test.pdf' 
    plt.savefig(fname) 

    return 0 


if __name__ == '__main__': 
    sys.exit(main()) 

Since the text input is long, I have put it on pastebin here: http://pastebin.com/hmCUabvu and the code above is also on pastebin: http://pastebin.com/07TNYie4


Can you provide a runnable example, with sample data, that demonstrates the problem? – BrenBarn 2014-09-27 05:20:48


@BrenBarn - I added the code, plus pastebin links to the data input and to the same code. Thanks – KBA 2014-09-27 16:59:42

Answer


Because the data is the same. Your lines are being drawn on top of each other.

>>> np.all(df['series1'] == df['series5']) 
True 
>>> np.all(df['series1'] == df['series3']) 
True 
>>> np.all(df['series2'] == df['series4']) 
True 
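The same check can be run across every pair of columns at once. The sketch below is only an illustration: the helper name identical_column_pairs and the demo values are made up, chosen to mirror the pattern of duplicates shown above.

```python
from itertools import combinations

import pandas as pd


def identical_column_pairs(df):
    """Return pairs of column names whose values match row for row."""
    return [
        (a, b)
        for a, b in combinations(df.columns, 2)
        if df[a].equals(df[b])
    ]


# Made-up data that mirrors the situation in the answer:
# series1, series3 and series5 are equal, and so are series2 and series4.
dates = pd.to_datetime(["2014-09-01", "2014-09-02", "2014-09-03"])
demo = pd.DataFrame(
    {
        "series1": [1, 2, 3],
        "series2": [4, 5, 6],
        "series3": [1, 2, 3],
        "series4": [4, 5, 6],
        "series5": [1, 2, 3],
    },
    index=dates,
)

print(identical_column_pairs(demo))
```

With duplicates in this pattern, the line drawn last for series5 covers series1 and series3, and series4 covers series2, which would match the two visible lines and the five legend entries described in the question.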

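Thank you very much! I was racking my brain... I changed the data values and now I do see them all plotted :) In case it helps anyone, you can loop over a list of series names, such as 'series_2_extract' in my code, and pass each one to the plot_date function as 'y = df[series_2_extract[0]]' – KBA 2014-09-28 19:52:30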
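The loop mentioned in the comment above could look roughly like the following. This is a sketch under assumptions, not the poster's actual code: the data and the color list are made up, and it calls Axes.plot (which also accepts datetime x values) rather than plot_date.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data; each series has distinct values so no line hides another.
dates = pd.to_datetime(["2014-09-01", "2014-09-02", "2014-09-03"])
df = pd.DataFrame(
    {"series1": [1, 2, 3], "series2": [4, 5, 6], "series3": [2, 4, 1]},
    index=dates,
)

series_2_extract = ["series1", "series3", "series2"]
colors = ["red", "green", "blue"]  # one (arbitrary) color per series

fig, ax = plt.subplots()
for name, color in zip(series_2_extract, colors):
    # One call per series, all drawn on the same axes.
    ax.plot(df.index, df[name], marker="o", linestyle="-",
            color=color, label=name)

fig.autofmt_xdate()
ax.legend()
ax.set_xlabel("Day")
ax.set_ylabel("Count")
```

Calling fig.savefig("test.pdf") afterwards would write the figure to disk, as in the original script.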