Plotting Seaborn barplots as subplots with Python

I have two files as input, each of which looks like this:

col1 col2 
A B 
C C 
B A 
A A 
A C 
C A 
B B

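That is, each file has two columns of letters separated by a space. I want to plot how often each letter occurs, with each column in its own barplot. Assume the two files have different letter distributions.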
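As a point of reference, the counting step described here can be sketched on its own, without any plotting. This is only a minimal sketch: `count_columns` is a hypothetical helper name, and the sample text is the data shown above fed in through `io.StringIO` instead of a real file.

```python
from collections import Counter
import io

# Sample data copied from the question; a real run would open a file instead.
sample = io.StringIO("col1 col2\nA B\nC C\nB A\nA A\nA C\nC A\nB B\n")


def count_columns(lines):
    """Return one Counter per column, skipping the header row (hypothetical helper)."""
    col1, col2 = Counter(), Counter()
    for line in lines:
        parts = line.split()  # split() with no argument also drops the trailing newline
        if len(parts) != 2 or parts[0] == "col1":
            continue  # skip the header and any malformed line
        col1[parts[0]] += 1
        col2[parts[1]] += 1
    return col1, col2


col1_counts, col2_counts = count_columns(sample)
print(col1_counts)
print(col2_counts)
```

Each key in a Counter becomes one bar later on, so for this sample each barplot should show exactly three bars.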

Here is the code:

from collections import Counter 
from os.path import isfile, join 
from os import listdir 
import matplotlib.pyplot as plt 

import seaborn as sns 
sns.set(color_codes=True) 

inputDir = "/tmp/files/" 

inputFiles = [ f for f in listdir(inputDir) if isfile(join(inputDir, f)) ] 

fig, axes = plt.subplots(figsize=(6,6), ncols=2, nrows=len(inputFiles)) 

z=0 

while inputFiles: 

    files = inputFiles[0] 
    inputFiles.remove(files) 

    c = Counter() 
    a = Counter() 

    x1 = [] 
    y1 = [] 
    x2 = [] 
    y2 = [] 

    with open(inputDir + files, "r") as f2:
        for line in f2:
            line = line.strip()
            # skip the header row
            if line.split(" ")[0] != "col1":
                c[str(line.split(" ")[0])] += 1
                a[str(line.split(" ")[1])] += 1

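    try:
        # counts for col1 go into the left subplot of this file's row
        for cc in c:
            x1.append(cc)
            y1.append(c[cc])
        row = z // 2
        col = z % 2
        ax_curr = axes[row, col]
        sns.barplot(x=x1, y=y1, ax=ax_curr)

        z += 1

        # counts for col2 go into the right subplot of the same row
        for aa in a:
            x2.append(aa)
            y2.append(a[aa])
        row = z // 2
        col = z % 2
        ax_curr = axes[row, col]
        sns.barplot(x=x2, y=y2, ax=ax_curr)

        z += 1

    except Exception:
        # note: this silently skips a file if anything above fails
        continue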

plt.show()

The result should be a single image in which I have the following barplots as subplots:

-------------------------------------
|                 |                 |
|  barplot col1   |  barplot col2   |
|      file1      |      file1      |
|                 |                 |
-------------------------------------
|                 |                 |
|  barplot col1   |  barplot col2   |
|      file2      |      file2      |
|                 |                 |
-------------------------------------

So the height of each bar should correspond to the count of each letter.

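The problem so far is that the bars in each subplot look completely different, and I cannot figure out why. Please let me know if I can provide more information.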
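For comparison, the intended two-by-two layout can be sketched roughly as follows. This is a sketch under stated assumptions, not the asker's code: the counts are hard-coded stand-ins for two parsed files (the `file2` counts are invented for illustration), the `Agg` backend is chosen only so the example runs without a display, and a shared `letters` order is passed to `sns.barplot` so the bars line up across subplots.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical stand-ins for the parsed files: one (col1, col2) pair of Counters per file.
counts_per_file = {
    "file1": (Counter("AAACCBB"), Counter("BCAACAB")),
    "file2": (Counter("ABBBCCC"), Counter("CCCCABB")),
}

# One fixed category order shared by every subplot.
letters = sorted({k for pair in counts_per_file.values() for c in pair for k in c})

# squeeze=False keeps axes two-dimensional even when there is only one file.
fig, axes = plt.subplots(nrows=len(counts_per_file), ncols=2,
                         figsize=(6, 6), squeeze=False)

for row, (name, pair) in enumerate(counts_per_file.items()):
    for col, counter in enumerate(pair):
        ax = axes[row, col]
        heights = [counter[k] for k in letters]  # a missing letter counts as 0
        sns.barplot(x=letters, y=heights, order=letters, ax=ax)
        ax.set_title(f"col{col + 1}, {name}")

fig.tight_layout()
```

With an interactive backend, `plt.show()` would display the figure; indexing by `enumerate` also avoids having to track a running counter such as `z`.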


2017-02-12 JJ Abrams

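While it is not clear what "completely" different means here, you probably need to strip the lines before splitting them. Otherwise, the values in the last column may look like "B " instead of "B". Also, I am not sure why you are trying to convert the strings to integers in c[int(line.split(" ")[0])] += 1. That does not make much sense to me.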

Try:

with open(inputDir + files, "r") as f2:
    for line in f2:
        line = line.strip()
        if line.split(" ")[0] != "col1":
            c[line.split(" ")[0]] += 1
            a[line.split(" ")[1]] += 1
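To illustrate why stripping matters, here is a small self-contained sketch. The three input lines are made up for the demonstration; the last one deliberately has no trailing newline, as is often the case for the final line of a file.

```python
from collections import Counter

# Hypothetical lines as they would come from iterating over a file object.
raw_lines = ["A B\n", "C B\n", "B B"]

# Without strip(), the newline stays attached to the last column.
unstripped = Counter(line.split(" ")[1] for line in raw_lines)

# With strip(), all three lines yield the same key.
stripped = Counter(line.strip().split(" ")[1] for line in raw_lines)

print(unstripped)  # "B\n" and "B" are counted as two different keys
print(stripped)    # a single key "B"
```

Calling `line.split()` with no argument would have the same effect here, since it splits on any run of whitespace and discards leading and trailing whitespace.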


2017-02-13 08:42:20 ImportanceOfBeingErnest

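You are completely right. I have different data and adapted the code to the new data files overnight without running it. Sorry. I have adjusted my code, but it still does not work. In the top left I get several thousand bars, and in the top right only a few bars (which cannot be correct, since both files have about 100,000 lines).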

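I hope you understand that we can only discuss this problem based on the information provided here. So if your data differs from what you show in the question, it is hard to help you. You would then need to create a new [MCVE], reducing the data to an amount that can be shared here but that still reproduces the problem. – ImportanceOfBeingErnest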
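One possible way to get there is sketched below, with the caveat that `write_sample` and `distinct_keys` are hypothetical helper names and the "big" file is generated on the spot as a stand-in for the real data. The first helper copies only the first few lines of a large file; the second counts the distinct values in the first column, which is exactly the number of bars the plotting code would draw.

```python
import itertools
import os
import tempfile
from collections import Counter


def write_sample(src_path, dst_path, n_lines=20):
    """Copy the first n_lines of src_path to dst_path (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(itertools.islice(src, n_lines))


def distinct_keys(path):
    """Count the first-column values of a file, skipping the header (hypothetical helper)."""
    with open(path) as f:
        next(f, None)  # skip the header row
        return Counter(line.split()[0] for line in f if line.strip())


# Throwaway stand-in for a large input file.
tmp_dir = tempfile.mkdtemp()
big = os.path.join(tmp_dir, "big.txt")
small = os.path.join(tmp_dir, "small.txt")
with open(big, "w") as f:
    f.write("col1 col2\n")
    for i in range(1000):
        f.write("ABC"[i % 3] + " " + "ABC"[(i + 1) % 3] + "\n")

write_sample(big, small, n_lines=8)
keys = distinct_keys(small)
print(len(keys), keys.most_common(3))
```

If `len(keys)` comes out in the thousands for a real file, the first column holds far more distinct values than a handful of letters, which would explain a subplot with thousands of bars.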
