2014-08-27 138 views
0

Data normalization

I have an SQL query A (described in detail below) that returns the following table:

cluster brand amount 
0   bos  600 
0   phi  300 
0   har  100 
1   pro 2500 
1   wal 1500 
1   ash 1000 
2   dil 4200 
2   sor  500 
2   van  300 
... 

However, I would like to show not the absolute amount but the fraction of the cluster's total amount, as in the following table:

cluster brand amount 
0   bos 0.60 
0   phi 0.30 
0   har 0.10 
1   pro 0.50 
1   wal 0.30 
1   ash 0.20 
2   dil 0.84 
2   sor 0.10 
2   van 0.06 
... 

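How should I change my SQL so that I can access the sum over all amounts in one cluster, and yet still have multiple rows with the same cluster?

Details

SQL server: MySQL, accessed through the Python-MySQL connector interface.

The current SQL query that produces the first table: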

SELECT c.cluster, brand, COUNT(o.id) AS brand_amount 
FROM nyon_all.clustering AS c 
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id 
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid 
LEFT JOIN nyon_all.articles AS a ON o.aid = a.id 
LEFT JOIN nyon_all.brands AS ab ON a.brand_id = ab.id 
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35' 
GROUP BY cluster, brand 
HAVING brand_amount > 100 
ORDER BY c.cluster ASC, brand_amount DESC; 

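orders (primary key id) links persons (foreign key pid) with articles (foreign key aid). Articles have a certain brand (foreign key brand_id), whose name is stored in the table brands.

The total amount of articles per cluster can be retrieved with the following SQL query: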

SELECT c.cluster, COUNT(o.pid) AS amount 
FROM nyon_all.clustering AS c 
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id 
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid 
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35' 
GROUP BY cluster 
ORDER BY c.cluster ASC, amount DESC; 

Result:

cluster amount 
0  1000 
1  5000 
2  5000 

However, I don't seem to be able to combine the two SQL queries.

+2

Data is normalized in tables, not in SQL queries! :) – NoobEditor 2014-08-27 13:03:57

Answers

1

You can join on a subquery that sums the amounts per cluster:

select t1.cluster, t1.brand, t1.amount / s.sumAmount as amount
from Table1 t1
join (select cluster, sum(amount) as sumAmount
      from Table1
      group by cluster) s
on t1.cluster = s.cluster

See SqlFiddle
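
For readers who cannot open the fiddle, here is a minimal sketch of the same subquery join that runs locally. It assumes an in-memory SQLite database in place of MySQL and a hypothetical `Table1` filled with the nine rows from the question. The `1.0 *` factor is only needed because SQLite would otherwise perform integer division; MySQL's `/` operator already returns a fractional result.

```python
import sqlite3

# Hypothetical stand-in for the question's first result table.
rows = [
    (0, "bos", 600), (0, "phi", 300), (0, "har", 100),
    (1, "pro", 2500), (1, "wal", 1500), (1, "ash", 1000),
    (2, "dil", 4200), (2, "sor", 500), (2, "van", 300),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (cluster INTEGER, brand TEXT, amount INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)

# Join every row to the total of its cluster, then divide.
# "1.0 *" forces floating-point division in SQLite (an assumption of this sketch).
query = """
    SELECT t1.cluster, t1.brand, 1.0 * t1.amount / s.sumAmount AS fraction
    FROM Table1 AS t1
    JOIN (SELECT cluster, SUM(amount) AS sumAmount
          FROM Table1
          GROUP BY cluster) AS s
      ON t1.cluster = s.cluster
    ORDER BY t1.cluster ASC, fraction DESC
"""
fractions = [(c, b, round(f, 2)) for c, b, f in conn.execute(query)]
for row in fractions:
    print(row)
```

Each cluster keeps one row per brand, and the fractions within a cluster add up to 1, which matches the second table in the question.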

Edit

SELECT
    c.cluster,
    brand,
    -- NULLIF turns a zero total into NULL, so the division never divides by 0
    COUNT(o.id) / NULLIF(s.sumBrandAmount, 0) AS brand_amount
FROM nyon_all.clustering AS c
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid
LEFT JOIN nyon_all.articles AS a ON o.aid = a.id
LEFT JOIN nyon_all.brands AS ab ON a.brand_id = ab.id
LEFT JOIN (select c1.cluster, count(o1.id) as sumBrandAmount
      from nyon_all.clustering c1
      left join nyon_all.persons p1 on p1.id = c1.pid
      left join nyon_all.orders as o1 on o1.pid = p1.id
      -- same filter as the main query, so the totals cover the same clustering round
      where c1.cluster_round = 'Org_2014-08-27_10:45:35'
      group by c1.cluster) s
           ON s.cluster = c.cluster
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35'
GROUP BY c.cluster, brand, s.sumBrandAmount
HAVING COUNT(o.id) > 100 -- filter on the raw count; brand_amount is now a fraction
ORDER BY c.cluster ASC, brand_amount DESC;
+0

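Thanks for your answer, but I don't understand it. Should I replace Table1 with my big query? If I try that, I get Error Code 1054: Unknown column 'amount' in 'field list'. I am not familiar with SqlFiddle. The link shows me two empty boxes. What should I do with it? – physicalattraction 2014-08-27 13:28:25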

+0

@physicalattraction There seem to be some problems with SqlFiddle (it doesn't always work)... I'll try to edit my answer to use your query. – 2014-08-27 13:32:20

+0

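@physicalattraction See the edited answer. Of course, to make things easier to read, you could create a view based on the query in your question and use that instead of rewriting everything... – 2014-08-27 13:38:13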
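
The view suggestion in the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses an in-memory SQLite database in place of MySQL, a hypothetical single table `order_brands` (one row per order, already labelled with cluster and brand) in place of the five joined tables, and a hypothetical view name `brand_counts`. The `1.0 *` factor is again only there because SQLite divides integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified data: one row per order, already labelled.
conn.execute("CREATE TABLE order_brands (cluster INTEGER, brand TEXT)")
conn.executemany(
    "INSERT INTO order_brands VALUES (?, ?)",
    [(0, "bos")] * 6 + [(0, "phi")] * 3 + [(0, "har")] * 1
    + [(1, "pro")] * 5 + [(1, "wal")] * 3 + [(1, "ash")] * 2,
)

# The view plays the role of the question's first query (counts per cluster and brand).
conn.execute("""
    CREATE VIEW brand_counts AS
    SELECT cluster, brand, COUNT(*) AS amount
    FROM order_brands
    GROUP BY cluster, brand
""")

# The normalizing query reads the view twice instead of repeating the joins.
query = """
    SELECT v.cluster, v.brand, 1.0 * v.amount / s.sumAmount AS fraction
    FROM brand_counts AS v
    JOIN (SELECT cluster, SUM(amount) AS sumAmount
          FROM brand_counts
          GROUP BY cluster) AS s
      ON v.cluster = s.cluster
    ORDER BY v.cluster ASC, fraction DESC
"""
result = [(c, b, round(f, 2)) for c, b, f in conn.execute(query)]
for row in result:
    print(row)
```

With a view in place, the long join chain is written once, and the per-cluster total is computed from the same counts that are being normalized.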