2012-10-09 133 views
3

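MySQL: calculating a max across multiple columns/tables

I have some MySQL tables that I would like to extract some maximum-value information from. The tables are:

  • Videos - represents a video with a score (points).
  • Tags - contains the global list of tags.
  • VideoTags - creates the association between videos and tags.

In addition to the video resources, I also have picture resources:

  • Pictures - represents a picture with a score (points).
  • PictureTopic - creates the association between pictures and topics.

And for ownership of the videos and pictures:

  • Users - a table of users, who can own videos and pictures.

What I want to do is find the highest-scoring video or picture for each tag/topic. Many videos and pictures can share the same tag/topic, but my result set should have the same number of rows as there are tags/topics. The ultimate goal is a list of the best video or picture (by points) for each unique tag (a tag is a topic prefixed with a hash).

Using the solution from a previous question (http://stackoverflow.com/questions/12778329/mysql-data-extraction-from-3-tables-joins-and-max) I was able to get the highest-scoring video for each tag: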

SELECT SUBSTR(Tags.content,2) as topic_id, Videos.id as resource_id, 'video' as resource_type, Videos.owner_id as resource_owner_id, Videos.points FROM Videos JOIN (
    SELECT VideoTags.tag_id, MAX(points) points 
    FROM  Videos JOIN VideoTags ON Videos.id = VideoTags.video_id 
    GROUP BY VideoTags.tag_id 
) t USING (points) JOIN Tags ON t.tag_id = Tags.id and Tags.content LIKE "#%" 
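
One thing to watch in the query above: the derived table is joined back to Videos on points alone, so a video can be matched to the maximum of a tag it does not carry. The following is a minimal sketch of that effect, not part of the original question. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL so that it runs stand-alone, and it adds one hypothetical video, 'owner-y-video-n', to the sample data. The SQL itself uses only joins and a grouped subquery.

```python
import sqlite3

# SQLite stands in for MySQL so the sketch runs stand-alone (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Videos (id TEXT, owner_id TEXT, points REAL);
CREATE TABLE Tags (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE VideoTags (video_id TEXT, tag_id INTEGER);
INSERT INTO Videos VALUES
    ('owner-x-video-a', 'owner-x', 20), ('owner-x-video-b', 'owner-x', 15),
    ('owner-y-video-k', 'owner-y', 12), ('owner-y-video-l', 'owner-y', 17),
    ('owner-y-video-m', 'owner-y', 44),
    ('owner-y-video-n', 'owner-y', 20);  -- hypothetical extra video
INSERT INTO Tags VALUES (111, '#topic-1'), (222, '#topic-2');
INSERT INTO VideoTags VALUES
    ('owner-x-video-a', 111), ('owner-x-video-b', 111), ('owner-y-video-k', 111),
    ('owner-y-video-l', 222), ('owner-y-video-m', 222),
    ('owner-y-video-n', 222);            -- hypothetical: tagged #topic-2 only
""")

# Highest points per tag; shared by both queries below.
MAX_PER_TAG = """
    SELECT vt2.tag_id, MAX(v2.points) AS points
    FROM Videos v2 JOIN VideoTags vt2 ON v2.id = vt2.video_id
    GROUP BY vt2.tag_id
"""

# Join back on points only, as in the query above.
points_only = conn.execute(f"""
    SELECT SUBSTR(t.content, 2) AS topic_id, v.id
    FROM Videos v
    JOIN ({MAX_PER_TAG}) m ON m.points = v.points
    JOIN Tags t ON t.id = m.tag_id AND t.content LIKE '#%'
    ORDER BY topic_id, v.id
""").fetchall()

# Join back on tag and points, so only videos carrying the tag can match.
tag_and_points = conn.execute(f"""
    SELECT SUBSTR(t.content, 2) AS topic_id, v.id
    FROM Videos v
    JOIN VideoTags vt ON vt.video_id = v.id
    JOIN ({MAX_PER_TAG}) m ON m.tag_id = vt.tag_id AND m.points = v.points
    JOIN Tags t ON t.id = m.tag_id AND t.content LIKE '#%'
    ORDER BY topic_id, v.id
""").fetchall()

print(points_only)
print(tag_and_points)
```

In this sketch the points-only join reports 'owner-y-video-n' under topic-1 even though it is tagged only with #topic-2, because its 20 points happen to equal the topic-1 maximum. Joining on the tag as well removes that row.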

I can also (sort of) get the highest-scoring picture for each topic with this statement:

SELECT PictureTopic.topic_id, Pictures.id as resource_id, 'picture' as resource_type, Pictures.owner_id as resource_owner_id, MAX(points) points 
FROM  Pictures JOIN PictureTopic ON Pictures.id = PictureTopic.picture_id 
GROUP BY PictureTopic.topic_id 

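What I want is the highest-scoring picture or video for each tag/topic, with the following edge cases handled:

  • If more than one picture or video ties for a given topic (i.e. they have the same high score), defer to the points of the resource owners. If the owners also have the same points (unlikely), both resources can appear in the result set (unless the resources are owned by the same user, in which case there should be only one result in the result set).
  • If a video or picture has fewer than 20 points, exclude that resource from the result set.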

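As a software developer who uses Grails a lot, I like to rely on object-relational mapping, so my SQL skills are lame. The best I have managed so far is to UNION the results of the two selects: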

SELECT SUBSTR(Tags.content,2) as topic_id, Videos.id as resource_id, 'video' as resource_type, Videos.owner_id as resource_owner_id, Videos.points FROM Videos JOIN (
    SELECT VideoTags.tag_id, MAX(points) points 
    FROM  Videos JOIN VideoTags ON Videos.id = VideoTags.video_id 
    GROUP BY VideoTags.tag_id 
) t USING (points) JOIN Tags ON t.tag_id = Tags.id and Tags.content LIKE "#%" 
UNION 
SELECT PictureTopic.topic_id, Pictures.id as resource_id, 'picture' as resource_type, Pictures.owner_id as resource_owner_id, MAX(points) points 
FROM  Pictures JOIN PictureTopic ON Pictures.id = PictureTopic.picture_id 
GROUP BY PictureTopic.topic_id 

But unfortunately this does not return the high-scoring pictures I expected, as can be seen on sqlfiddle (http://sqlfiddle.com/#!2/6650d/1).

The output from this query is:

TOPIC_ID RESOURCE_ID   RESOURCE_TYPE RESOURCE_OWNER_ID POINTS 
topic-1  owner-x-video-a  video   owner-x    20 
topic-2  owner-y-video-m  video   owner-y    44 
topic-1  owner-j-pic-1  picture   owner-j    50 
topic-3  owner-k-pic-2  picture   owner-k    22 

But I expected this row too:

TOPIC_ID RESOURCE_ID   RESOURCE_TYPE RESOURCE_OWNER_ID POINTS 
topic-3  owner-l-pic-3  picture   owner-l    22 
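
A likely reason the row is missing: GROUP BY PictureTopic.topic_id collapses each topic to a single row, and the non-aggregated columns (Pictures.id, owner_id) are then taken from whichever row the engine picks. A common way to keep every tied picture is to compute the per-topic maximum in a subquery and join back to it. The sketch below is an illustration under assumptions, not the asker's code: it uses SQLite through Python's sqlite3 module as a stand-in for MySQL, with simplified column types.

```python
import sqlite3

# SQLite stands in for MySQL so the sketch runs stand-alone (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pictures (id TEXT, owner_id TEXT, points REAL);
CREATE TABLE PictureTopic (picture_id TEXT, topic_id TEXT);
INSERT INTO Pictures VALUES
    ('owner-j-pic-1', 'owner-j', 50), ('owner-k-pic-2', 'owner-k', 22),
    ('owner-l-pic-3', 'owner-l', 22);
INSERT INTO PictureTopic VALUES
    ('owner-j-pic-1', 'topic-1'), ('owner-k-pic-2', 'topic-3'),
    ('owner-l-pic-3', 'topic-3');
""")

# Subquery m holds one row per topic with its highest score; joining back on
# (topic, points) keeps every picture that reaches that score, ties included.
top_pictures = conn.execute("""
    SELECT pt.topic_id, p.id, p.owner_id, p.points
    FROM Pictures p
    JOIN PictureTopic pt ON pt.picture_id = p.id
    JOIN (SELECT pt2.topic_id, MAX(p2.points) AS max_points
          FROM Pictures p2
          JOIN PictureTopic pt2 ON pt2.picture_id = p2.id
          GROUP BY pt2.topic_id) m
      ON m.topic_id = pt.topic_id AND m.max_points = p.points
    ORDER BY pt.topic_id, p.id
""").fetchall()

for row in top_pictures:
    print(row)
```

With the sample data this returns both 22-point pictures for topic-3, which leaves the owner-points tie-break as a separate step.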

And after applying the equal-high-score and score-threshold edge cases I would like to see:

TOPIC_ID RESOURCE_ID   RESOURCE_TYPE RESOURCE_OWNER_ID POINTS 
topic-1  owner-j-pic-1  picture   owner-j    50 
topic-2  owner-y-video-m  video   owner-y    44 
topic-3  owner-l-pic-3  picture   owner-l    22 
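
The three rows above can be reproduced without relying on how GROUP BY picks a row. What follows is a minimal sketch under stated assumptions rather than a definitive solution. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, with simplified column types. The view name Resources is hypothetical. Resource ids are assumed to be unique across pictures and videos, and a resource with exactly 20 points is assumed to stay in. The idea is an anti-join: a row survives only if no other row for the same topic beats it on points, then on owner points, then (same owner only) on id.

```python
import sqlite3

# SQLite stands in for MySQL so the sketch runs stand-alone (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id TEXT PRIMARY KEY, points REAL);
CREATE TABLE Videos (id TEXT, owner_id TEXT, points REAL);
CREATE TABLE Tags (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE VideoTags (video_id TEXT, tag_id INTEGER);
CREATE TABLE Pictures (id TEXT, owner_id TEXT, points REAL);
CREATE TABLE PictureTopic (picture_id TEXT, topic_id TEXT);

INSERT INTO Users VALUES ('owner-x', 0), ('owner-y', 0), ('owner-j', 0),
                         ('owner-k', 5), ('owner-l', 14);
INSERT INTO Videos VALUES
    ('owner-x-video-a', 'owner-x', 20), ('owner-x-video-b', 'owner-x', 15),
    ('owner-y-video-k', 'owner-y', 12), ('owner-y-video-l', 'owner-y', 17),
    ('owner-y-video-m', 'owner-y', 44);
INSERT INTO Tags VALUES (111, '#topic-1'), (222, '#topic-2');
INSERT INTO VideoTags VALUES
    ('owner-x-video-a', 111), ('owner-x-video-b', 111), ('owner-y-video-k', 111),
    ('owner-y-video-l', 222), ('owner-y-video-m', 222);
INSERT INTO Pictures VALUES
    ('owner-j-pic-1', 'owner-j', 50), ('owner-k-pic-2', 'owner-k', 22),
    ('owner-l-pic-3', 'owner-l', 22);
INSERT INTO PictureTopic VALUES
    ('owner-j-pic-1', 'topic-1'), ('owner-k-pic-2', 'topic-3'),
    ('owner-l-pic-3', 'topic-3');

-- Resources (hypothetical name): one row per (topic, resource), pictures and
-- videos together, with the 20-point threshold already applied.
CREATE VIEW Resources AS
    SELECT pt.topic_id AS topic_id, p.id AS resource_id,
           'picture' AS resource_type, p.owner_id AS owner_id,
           p.points AS points, u.points AS owner_points
    FROM Pictures p
    JOIN PictureTopic pt ON pt.picture_id = p.id
    JOIN Users u ON u.id = p.owner_id
    WHERE p.points >= 20
    UNION ALL
    SELECT SUBSTR(t.content, 2), v.id, 'video', v.owner_id, v.points, u.points
    FROM Videos v
    JOIN VideoTags vt ON vt.video_id = v.id
    JOIN Tags t ON t.id = vt.tag_id AND t.content LIKE '#%'
    JOIN Users u ON u.id = v.owner_id
    WHERE v.points >= 20;
""")

# Keep a row only if no row b in the same topic beats it. b beats r when it has
# more points, or equal points and a higher-scoring owner, or equal points,
# the same owner and a smaller id (so one owner never appears twice per topic).
best = conn.execute("""
    SELECT r.topic_id, r.resource_id, r.resource_type, r.owner_id, r.points
    FROM Resources r
    WHERE NOT EXISTS (
        SELECT 1 FROM Resources b
        WHERE b.topic_id = r.topic_id
          AND (b.points > r.points
               OR (b.points = r.points AND b.owner_points > r.owner_points)
               OR (b.points = r.points AND b.owner_id = r.owner_id
                   AND b.resource_id < r.resource_id))
    )
    ORDER BY r.topic_id
""").fetchall()

for row in best:
    print(row)
```

Two different owners with equal resource points and equal owner points both survive the anti-join, which matches the first edge case. Whether the threshold should be >= 20 or > 20 depends on how "less than 20" is meant to treat a score of exactly 20.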

Here are the schema and sample data for reference:

CREATE TABLE `Users` (
    `id`  VARCHAR(24) NOT NULL DEFAULT '', 
    `points` DOUBLE  NOT NULL DEFAULT 0, 
    PRIMARY KEY (id) 
) Engine=InnoDB; 

DROP TABLE IF EXISTS `Videos`; 
CREATE TABLE `Videos` (
    `id` varchar(24) NOT NULL default '', 
    `owner_id` varchar(24) NOT NULL default '', 
    `points` DOUBLE NOT NULL default 0 
); 

DROP TABLE IF EXISTS `Tags`; 
CREATE TABLE `Tags` (
    `id` int(11) NOT NULL AUTO_INCREMENT, 
    `content` varchar(32) NOT NULL default '', 
    PRIMARY KEY (id) 
); 

DROP TABLE IF EXISTS `VideoTags`; 
CREATE TABLE `VideoTags` (
    `video_id` varchar(24) NOT NULL default '', 
    `tag_id` int(11) NOT NULL 
); 

DROP TABLE IF EXISTS `Pictures`; 
CREATE TABLE `Pictures` (
    `id` varchar(24) NOT NULL default '', 
    `owner_id` varchar(24) NOT NULL default '', 
    `points` DOUBLE NOT NULL default 0 
); 

DROP TABLE IF EXISTS `PictureTopic`; 
CREATE TABLE `PictureTopic` (
    `picture_id` varchar(24) NOT NULL, 
    `topic_id` varchar(31) NOT NULL 
); 

INSERT INTO Users (id, points) VALUES ('owner-x', 0); 
INSERT INTO Users (id, points) VALUES ('owner-y', 0); 
INSERT INTO Users (id, points) VALUES ('owner-j', 0); 
INSERT INTO Users (id, points) VALUES ('owner-k', 5); 
INSERT INTO Users (id, points) VALUES ('owner-l', 14); 

INSERT INTO Videos (id,owner_id,points) VALUES 
    ('owner-x-video-a','owner-x', 20), 
    ('owner-x-video-b','owner-x', 15), 
    ('owner-y-video-k','owner-y', 12), 
    ('owner-y-video-l','owner-y', 17), 
    ('owner-y-video-m','owner-y', 44); 

INSERT INTO Tags (id, content) VALUES 
    (111, '#topic-1'), 
    (222, '#topic-2'); 

INSERT INTO VideoTags (video_id,tag_id) VALUES 
    ('owner-x-video-a',111), 
    ('owner-x-video-b',111), 
    ('owner-y-video-k',111), 
    ('owner-y-video-l',222), 
    ('owner-y-video-m',222); 

INSERT INTO Pictures (id, owner_id, points) VALUES ('owner-j-pic-1','owner-j', 50); 
INSERT INTO Pictures (id, owner_id, points) VALUES ('owner-k-pic-2','owner-k', 22); 
INSERT INTO Pictures (id, owner_id, points) VALUES ('owner-l-pic-3','owner-l', 22); 

INSERT INTO PictureTopic (picture_id, topic_id) VALUES ('owner-j-pic-1','topic-1'); 
INSERT INTO PictureTopic (picture_id, topic_id) VALUES ('owner-k-pic-2','topic-3'); 
INSERT INTO PictureTopic (picture_id, topic_id) VALUES ('owner-l-pic-3','topic-3'); 

Any pointers on how best to extract this information? Cheers :)

Could you please clarify your scenario? I can't understand it. –

Answers

2
SELECT TOPIC_ID, RESOURCE_ID, RESOURCE_TYPE, RESOURCE_OWNER_ID, POINTS 
FROM ((SELECT pt.topic_id AS TOPIC_ID, 
              p.id AS RESOURCE_ID, 
              'picture' AS RESOURCE_TYPE, 
              p.owner_id AS RESOURCE_OWNER_ID, 
              p.points AS POINTS, 
              u.points AS user_points 
       FROM Pictures AS p 
       INNER JOIN PictureTopic AS pt 
       ON p.id = pt.picture_id 
       INNER JOIN Users AS u 
       ON p.owner_id = u.id) 
      UNION ALL 
      -- SUBSTR(..., 2) drops the leading '#' so tags line up with picture topics 
      (SELECT SUBSTR(t.content, 2), v.id, 'video', v.owner_id, v.points, u2.points 
       FROM Videos AS v 
       INNER JOIN VideoTags AS vt 
       ON v.id = vt.video_id 
       INNER JOIN Tags AS t 
       ON vt.tag_id = t.id 
       INNER JOIN Users AS u2 
       ON v.owner_id = u2.id) 
      -- best resource first: by resource points, then by owner points 
      ORDER BY POINTS DESC, user_points DESC) AS h 
GROUP BY TOPIC_ID 
ORDER BY TOPIC_ID ASC 

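This query uses INNER JOIN, subqueries, UNION, GROUP BY, and the unofficial MySQL assumption that GROUP BY returns the first row of each group, which here is the first row according to ORDER BY POINTS DESC. MySQL does not guarantee that behaviour, and the query is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled.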

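This is great, thanks @robin-castlin. Any idea how I would defer to the points of users A and B to find the high scorer when user A's video and user B's picture both have the same points? That is the case in the sample data for users 'owner-k' and 'owner-l': they both have a picture with 22 points, so the picture belonging to 'owner-l' should be the highest scorer, because 'owner-l' has more points in the Users table than 'owner-k'. – Tetsuo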

Just 'JOIN' the 'Users' table in both 'SELECT's and update the 'ORDER BY' with 'user_points DESC'. The query above has been updated. –

Thanks Robin! Your check is in the mail ;) – Tetsuo

0

Here is the query for the videos:

select 
    t.content as `TOPIC-ID`, 
    vt.video_id as `RESOURCE-ID`, 
    'video' as `RESOURCE-TYPE`, 
    vt.owner_id as `RESOURCE-OWNER-ID`, 
    vt.MaxPoints 

from Tags as t 
inner join 
    -- highest points per tag; video_id, id and owner_id are not aggregated, 
    -- so MySQL may take them from any row of the group 
    (SELECT 
     vt.tag_id, 
     vt.video_id, 
     MAX(v.points) as MaxPoints, 
     v.id, 
     v.owner_id 
    FROM VideoTags as vt 
    left join Videos as v on v.id = vt.video_id 
    group by vt.tag_id 

    ) as vt on vt.tag_id = t.id 

union all 
-- every picture with its topic (no per-topic maximum is applied here) 
SELECT 
    topic_id as `TOPIC-ID`, 
    picture_id as `RESOURCE-ID`, 
    'picture' as `RESOURCE-TYPE`, 
    p.owner_id as `RESOURCE-OWNER-ID`, 
    p.points as MaxPoints 
FROM PictureTopic 
LEFT JOIN (SELECT id , owner_id , points FROM Pictures) as p on p.id = PictureTopic.picture_id