2016-11-16

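How do I calculate the median of an inner-joined field when using GROUP BY?

I have the following query, which retrieves the number of sales of a specific item and the average sale price for each day.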

SELECT COUNT(1) AS num_sales, DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date, AVG(prices.price) AS avg_price 
FROM sales INNER JOIN prices ON prices.id = sales.price_id 
WHERE prices.item_id = 7503 AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0)) 
GROUP BY date 
ORDER BY date ASC 

I also have a for loop that runs a separate query for each day to get the median price (assuming the number of results is even):

SELECT prices.price FROM sales INNER JOIN prices ON prices.id = sales.price_id 
WHERE prices.item_id = 7503 
AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0)) 
AND DATE(sales.created_at) = "<THE DATE OF THE CURRENT FOR-LOOP OBJECT>" 
ORDER BY prices.price ASC 
LIMIT 1 OFFSET <NUMBER OF THE MIDDLE ROW> 

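As you can imagine, this is very slow, because in some cases hundreds of queries have to be run against large tables (the sales table has several hundred million rows).

How can I rewrite the first SQL query so that it also calculates the median of prices.price, similar to AVG(prices.price)? I have looked at answers such as this one, but I could not wrap my head around how to adapt them to my specific scenario.

I have spent hours trying to do this, but my SQL knowledge simply isn't good enough. Any help would be greatly appreciated!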

# mysql -V
mysql Ver 14.14 Distrib 5.7.13, for Linux (x86_64) using EditLine wrapper 

Table schemas:

CREATE TABLE `prices` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT, 
`item_id` int(11) unsigned NOT NULL, 
`price` decimal(8,2) NOT NULL, 
`net_price` decimal(8,2) NOT NULL, 
`source` tinyint(4) NOT NULL, 
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', 
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', 
PRIMARY KEY (`id`), 
UNIQUE KEY `id` (`id`), 
KEY `prices_ibfk_1` (`item_id`), 
CONSTRAINT `prices_ibfk_1` FOREIGN KEY (`item_id`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE 
) ENGINE=InnoDB AUTO_INCREMENT=4861375 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci 

CREATE TABLE `sales` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT, 
`price_id` int(11) unsigned DEFAULT NULL, 
`item_key` varchar(40) COLLATE utf8_unicode_ci NOT NULL, 
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', 
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00', 
PRIMARY KEY (`id`), 
UNIQUE KEY `id` (`id`), 
UNIQUE KEY `item_key` (`item_key`), 
KEY `price_id` (`price_id`), 
KEY `created_at` (`created_at`), 
KEY `price_id__created_at__IX` (`price_id`,`created_at`), 
CONSTRAINT `sales_ibfk_1` FOREIGN KEY (`price_id`) REFERENCES `prices` (`id`) ON UPDATE CASCADE 
) ENGINE=InnoDB AUTO_INCREMENT=386156944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci 

Example:

Example of output from my first query

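Please post the output of SHOW CREATE TABLE for your tables. – e4c5

Please share: what is the data type of price? What is the maximum number of rows per day? –

@e4c5 I have added the CREATE TABLE output. The maximum number of rows per day depends on the number of sales recorded. It could be several hundred thousand. – waylaidwanderer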

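Answer

I found the answer to my question here, after extensive searching. Perhaps I didn't phrase my question correctly at first.

I adapted the solution to my own case; here is the working query: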

SELECT COUNT(1) AS num_sales,
       DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date,
       AVG(prices.price) AS avg_price,
       -- Median: pick the middle value(s) from the sorted, comma-joined price list.
       CASE (COUNT(1) % 2)
         -- Odd count: the single middle value.
         WHEN 1 THEN SUBSTRING_INDEX(
                       SUBSTRING_INDEX(
                         GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                         ',', (COUNT(*) + 1) DIV 2),
                       ',', -1)
         -- Even count: the mean of the two middle values.
         ELSE (SUBSTRING_INDEX(
                 SUBSTRING_INDEX(
                   GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                   ',', COUNT(*) DIV 2),
                 ',', -1)
               + SUBSTRING_INDEX(
                   SUBSTRING_INDEX(
                     GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                     ',', COUNT(*) DIV 2 + 1),
                   ',', -1)) / 2
       END AS median_price
FROM sales
    INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7381
  AND (`prices`.`source` = 0
       OR (`prices`.`price` >= 400
           AND `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC;
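
One caveat with this approach: GROUP_CONCAT truncates its result at group_concat_max_len, which defaults to 1024 bytes, and MySQL reports the truncation as a warning rather than an error. With several hundred thousand sales on a single day the price list will be far longer than that, so the median would be taken from a cut-off list. The sketch below first shows, on a literal list made up for illustration, how the nested SUBSTRING_INDEX calls pick the k-th value, and then raises the limit for the current session. The 16 MiB figure is an assumption, not a measured value; the effective limit is also capped by max_allowed_packet.

```sql
-- Illustration only: pick the 2nd value from a sorted, comma-separated list.
-- The inner call keeps everything before the 2nd comma ('100.00,200.00');
-- the outer call keeps what follows the last remaining comma.
SELECT SUBSTRING_INDEX(
         SUBSTRING_INDEX('100.00,200.00,300.00', ',', 2),
         ',', -1) AS second_value;
-- returns '200.00'

-- Raise the GROUP_CONCAT limit for this session before running the median query.
-- 16777216 bytes (16 MiB) is an assumed value; size it to the busiest day.
SET SESSION group_concat_max_len = 16777216;

-- Check the value actually in effect.
SHOW VARIABLES LIKE 'group_concat_max_len';
```

Running SHOW WARNINGS right after the median query is a quick way to see whether any group was still cut off.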