Plotting a histogram from a csv file using matplotlib and pandas


My csv file is quite complex: it contains numeric as well as string attributes. Below is what the file looks like. I want to plot a histogram of the processes against the cpuid.

2016-03-17 parool singh

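Can you show what you have tried? You haven't posted any code that loads the data or attempts any of the well-documented [plotting methods](http://pandas.pydata.org/pandas-docs/stable/api.html#api-dataframe-plotting). – EdChum

Use the [`read_csv()` function](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html) to read the csv into a pandas DataFrame. – Sevanteri

If the second column only ever contains 'cpu_id', wouldn't it be reasonable to just give the column the header "cpu_id" and remove (search/replace) everything in the field except the value 0/1? – jDo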

You can use read_csv, indexing with str, and then plot with hist:

import pandas as pd
import matplotlib.pyplot as plt
import io

temp = u"""kmem_kmalloc;{cpu_id=1}
kmem_kmalloc;{cpu_id=1}
kmem_kmalloc;{cpu_id=1}
kmem_kmalloc;{cpu_id=1}
kmem_kfree;{cpu_id=1}
kmem_kfree;{cpu_id=1}
power_cpu_idle;{cpu_id=0}
power_cpu_idle;{cpu_id=0}
power_cpu_idle;{cpu_id=3}"""

# after testing, replace io.StringIO(temp) with the filename
s = pd.read_csv(io.StringIO(temp),
                sep=";",                # set separator; can be omitted if it is ',' (the default)
                header=None,            # no header row in the csv
                names=[None, 'cpuid'],  # column names (first is None because it becomes the index)
                index_col=0)            # use the first column as the index
# convert the one-column DataFrame to a Series
# (the squeeze=True keyword of read_csv was removed in pandas 2.0)
s = s.squeeze("columns")
print(s)
kmem_kmalloc  {cpu_id=1} 
kmem_kmalloc  {cpu_id=1} 
kmem_kmalloc  {cpu_id=1} 
kmem_kmalloc  {cpu_id=1} 
kmem_kfree  {cpu_id=1} 
kmem_kfree  {cpu_id=1} 
power_cpu_idle {cpu_id=0} 
power_cpu_idle {cpu_id=0} 
power_cpu_idle {cpu_id=3} 
Name: cpuid, dtype: object

# if max cpu <= 9, use indexing with .str
s = s.str[-2].astype(int)

# if cpu > 9, extract all the digits instead
# s = s.str.extract(r'(\d+)', expand=False).astype(int)
print(s)
kmem_kmalloc  1 
kmem_kmalloc  1 
kmem_kmalloc  1 
kmem_kmalloc  1 
kmem_kfree  1 
kmem_kfree  1 
power_cpu_idle 0 
power_cpu_idle 0 
power_cpu_idle 3 
Name: cpuid, dtype: int32 

plt.figure()
s.hist(alpha=0.5)
plt.show()
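
The histogram above shows only how many rows fall on each CPU. Since the question asks for processes versus cpuid, one possible extension (not part of the original answer) is to count events per CPU with `pd.crosstab` and draw grouped bars. This is a minimal sketch under assumptions: the column name `event`, the helper `count_events_per_cpu` and the constant `SAMPLE` are illustrative names, and the file is assumed to have the same `name;{cpu_id=N}` layout as the sample data.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Sample rows in the same layout as above (illustrative constant);
# pass a filename instead of io.StringIO(SAMPLE) for a real file.
SAMPLE = """kmem_kmalloc;{cpu_id=1}
kmem_kmalloc;{cpu_id=1}
kmem_kmalloc;{cpu_id=1}
kmem_kmalloc;{cpu_id=1}
kmem_kfree;{cpu_id=1}
kmem_kfree;{cpu_id=1}
power_cpu_idle;{cpu_id=0}
power_cpu_idle;{cpu_id=0}
power_cpu_idle;{cpu_id=3}"""


def count_events_per_cpu(source):
    """Return event counts: one row per CPU id, one column per event name."""
    # "event" is a hypothetical column name chosen for this sketch.
    df = pd.read_csv(source, sep=";", header=None, names=["event", "cpuid"])
    # Pull the digits out of "{cpu_id=N}"; \d+ also handles ids above 9.
    df["cpuid"] = df["cpuid"].str.extract(r"cpu_id=(\d+)", expand=False).astype(int)
    return pd.crosstab(df["cpuid"], df["event"])


if __name__ == "__main__":
    counts = count_events_per_cpu(io.StringIO(SAMPLE))
    print(counts)
    counts.plot.bar()  # one group of bars per CPU, one bar per event
    plt.xlabel("cpu_id")
    plt.ylabel("number of events")
    plt.show()
```

A bar chart of counts may suit discrete CPU ids better than `hist`, whose default of 10 equal-width bins is designed for continuous values.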


2016-03-17 11:58:22 jezrael
