2017-04-15 40 views

How do I extract unique permutations from a pandas DataSeries?

Working in Jupyter with a pandas DataSeries, I have a dataset with rows like this:

color: white 
engineType: diesel 
make: Ford 
manufacturingYear: 2004 
accidentCount: 123 

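What I need to do is plot accident count (Y axis) against manufacturing year (X axis) for every permutation of color/engine type/make. Any ideas on how to proceed?

To speed things up, I have this initial setup: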

import numpy as np 
import pandas as pd 
from pandas import DataFrame, Series 
import random 


colors = ['white', 'black','silver'] 
engineTypes = ['diesel', 'petrol'] 
makes = ['ford', 'mazda', 'subaru'] 
years = range(2000,2005) 

rowCount = 100 

def randomEl(data):
    # Draw rowCount items from data at random, with replacement
    return [random.choice(data) for _ in range(rowCount)]


df = DataFrame({ 
    'color': Series(randomEl(colors)), 
    'engineType': Series(randomEl(engineTypes)), 
    'make': Series(randomEl(makes)), 
    'year': Series(randomEl(years)), 
    'accidents': Series([int(1000*random.random()) for i in range(rowCount)]) 
}) 

Answers


You can get the accident count for each unique combination of color, engineType and make using groupby():

accident_counts = df.groupby(['color', 'engineType', 'make'])['accidents'].sum() 
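
The question asks for counts per manufacturing year, so here is a minimal sketch (not part of the original answer) that adds year to the group keys and then moves the three permutation keys into columns. It rebuilds a random frame like the one in the question so that it runs on its own; the snake_case names, the fixed seed and the per_year variable are my own assumptions.

```python
import random

import pandas as pd

random.seed(0)  # assumption: fixed seed so the sketch is reproducible
row_count = 100
colors = ['white', 'black', 'silver']
engine_types = ['diesel', 'petrol']
makes = ['ford', 'mazda', 'subaru']
years = list(range(2000, 2005))

# Random stand-in data with the same columns as the question's setup
df = pd.DataFrame({
    'color': [random.choice(colors) for _ in range(row_count)],
    'engineType': [random.choice(engine_types) for _ in range(row_count)],
    'make': [random.choice(makes) for _ in range(row_count)],
    'year': [random.choice(years) for _ in range(row_count)],
    'accidents': [random.randrange(1000) for _ in range(row_count)],
})

# Sum accidents per (year, color, engineType, make), then unstack the
# three permutation keys so each row is a year and each column is one
# color/engineType/make combination. Missing cells are filled with 0.
per_year = (
    df.groupby(['year', 'color', 'engineType', 'make'])['accidents']
      .sum()
      .unstack(['color', 'engineType', 'make'], fill_value=0)
)
print(per_year.head())
```

Each column of per_year is then one permutation's series over the years, which is the shape a line plot expects.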

Matplotlib is one way to plot the result:

import matplotlib.pyplot as plt 
accident_counts.plot(kind='bar') 
plt.show() 
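
The bar chart above totals accidents over all years. If you want year on the X axis with one line per color/engineType/make permutation, something along these lines should work. This is only a sketch: the small hand-made frame and its values are hypothetical stand-ins for df, and the Agg backend line is there only so it runs outside a notebook.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for scripts; omit this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for df, using the question's column names
df = pd.DataFrame({
    'color': ['white', 'white', 'black', 'black', 'white'],
    'engineType': ['diesel', 'diesel', 'petrol', 'petrol', 'diesel'],
    'make': ['ford', 'ford', 'mazda', 'mazda', 'ford'],
    'year': [2000, 2001, 2000, 2001, 2000],
    'accidents': [10, 20, 30, 40, 5],
})

fig, ax = plt.subplots()
# One group per unique (color, engineType, make) permutation
for (color, engine, make), group in df.groupby(['color', 'engineType', 'make']):
    # Total accidents per year within this permutation
    yearly = group.groupby('year')['accidents'].sum()
    ax.plot(yearly.index, yearly.values, marker='o',
            label=f'{color}/{engine}/{make}')

ax.set_xlabel('year')
ax.set_ylabel('accidents')
ax.legend()
```

In Jupyter the figure renders inline; in a script, call plt.show() or fig.savefig(...) afterwards. With many permutations the legend gets crowded, so a grid of subplots may read better.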

Good answer. It's amazing how much work pandas does for you. – Chuck