2011-06-24 61 views
20

How to delete duplicates in a SQL table based on multiple fields

I have a table of games, which is described as follows:

+---------------+-------------+------+-----+---------+----------------+
| Field         | Type        | Null | Key | Default | Extra          |
+---------------+-------------+------+-----+---------+----------------+
| id            | int(11)     | NO   | PRI | NULL    | auto_increment |
| date          | date        | NO   |     | NULL    |                |
| time          | time        | NO   |     | NULL    |                |
| hometeam_id   | int(11)     | NO   | MUL | NULL    |                |
| awayteam_id   | int(11)     | NO   | MUL | NULL    |                |
| locationcity  | varchar(30) | NO   |     | NULL    |                |
| locationstate | varchar(20) | NO   |     | NULL    |                |
+---------------+-------------+------+-----+---------+----------------+

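But every game has a duplicate entry somewhere in the table, because each game appears in the schedules of both teams. Is there a SQL statement I can use to look through the table and delete all duplicates based on identical date, time, hometeam_id, awayteam_id, locationcity, and locationstate fields?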

Answers

36

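You should be able to do a correlated subquery to delete the data. Find all rows that are duplicated and delete all of them except the one with the smallest id. For MySQL, an inner join (the functional equivalent of EXISTS) needs to be used, like so: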

delete games from games inner join 
    (select min(id) minid, date, time, 
      hometeam_id, awayteam_id, locationcity, locationstate 
    from games 
    group by date, time, hometeam_id, 
       awayteam_id, locationcity, locationstate 
    having count(1) > 1) as duplicates 
    on (duplicates.date = games.date 
    and duplicates.time = games.time 
    and duplicates.hometeam_id = games.hometeam_id 
    and duplicates.awayteam_id = games.awayteam_id 
    and duplicates.locationcity = games.locationcity 
    and duplicates.locationstate = games.locationstate 
    and duplicates.minid <> games.id) 

To test, replace delete games from games with select * from games. Don't just run a delete on your database :-)
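As a minimal sketch of that dry run, the SELECT below reuses the join from the answer and only swaps the leading clause, so it should list exactly the rows the DELETE would remove. The ORDER BY is an addition for readability and is not part of the original statement.

```sql
-- Dry run: show every duplicate row except the lowest id in each group.
SELECT games.*
FROM games
INNER JOIN (
    -- One row per duplicated game, carrying the id that will be kept.
    SELECT MIN(id) AS minid, date, time,
           hometeam_id, awayteam_id, locationcity, locationstate
    FROM games
    GROUP BY date, time, hometeam_id,
             awayteam_id, locationcity, locationstate
    HAVING COUNT(*) > 1
) AS duplicates
    ON  duplicates.date = games.date
    AND duplicates.time = games.time
    AND duplicates.hometeam_id = games.hometeam_id
    AND duplicates.awayteam_id = games.awayteam_id
    AND duplicates.locationcity = games.locationcity
    AND duplicates.locationstate = games.locationstate
    AND duplicates.minid <> games.id
ORDER BY games.date, games.time, games.id;
```

If the listed ids are the ones you expect to lose, the same join can then be run with delete games from games in place of SELECT games.*.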

+0

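I tried the select version of this and it looks like exactly what I want to get rid of, but when I actually run it with "delete" it throws an error and tells me "Error Code: 1093. You can't specify target table 'games' for update in FROM clause". Any ideas? – cfrederich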

+0

Try the updated answer; I replaced the EXISTS with a delete that uses INNER JOIN. I think MySQL may have trouble combining a delete with an EXISTS clause. –

2

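As long as you don't select the table's id (primary key) in your query and the rest of the data is exactly the same, you can use SELECT DISTINCT to avoid getting duplicate results.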
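For illustration, a sketch of that idea against the games table from the question. It only hides the duplicates at read time and does not remove anything from the table. The ORDER BY is an assumption added for readable output.

```sql
-- Each game appears once because id is left out of the select list.
SELECT DISTINCT date, time, hometeam_id, awayteam_id,
                locationcity, locationstate
FROM games
ORDER BY date, time;
```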

4
select orig.id,
       dupl.id
from games orig,
     games dupl
where orig.date = dupl.date
and orig.time = dupl.time
and orig.hometeam_id = dupl.hometeam_id
and orig.awayteam_id = dupl.awayteam_id
and orig.locationcity = dupl.locationcity
and orig.locationstate = dupl.locationstate
and orig.id < dupl.id

This should give you the duplicates; you can use it as a subquery to specify the IDs to delete.
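One way this might look in MySQL is sketched below. MySQL reports error 1093 when a DELETE reads its own target table in a plain subquery, so the self-join is wrapped in a derived table here; the alias ids_to_delete and the column name dup_id are hypothetical. The DISTINCT is there on the assumption that it keeps the optimizer from merging the derived table back into the outer query, which could bring the error back on some versions.

```sql
-- Delete every row that has a matching row with a smaller id.
DELETE FROM games
WHERE id IN (
    SELECT dup_id
    FROM (
        -- Same self-join as above, returning only the later copy's id.
        SELECT DISTINCT dupl.id AS dup_id
        FROM games AS orig
        INNER JOIN games AS dupl
            ON  orig.date = dupl.date
            AND orig.time = dupl.time
            AND orig.hometeam_id = dupl.hometeam_id
            AND orig.awayteam_id = dupl.awayteam_id
            AND orig.locationcity = dupl.locationcity
            AND orig.locationstate = dupl.locationstate
            AND orig.id < dupl.id
    ) AS ids_to_delete
);
```

Running the inner SELECT on its own first shows which ids would be removed.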

11

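You can try a query like this:

DELETE FROM table_name AS t1
WHERE EXISTS (
    SELECT 1 FROM table_name AS t2
    WHERE t2.date = t1.date
    AND t2.time = t1.time
    AND t2.hometeam_id = t1.hometeam_id
    AND t2.awayteam_id = t1.awayteam_id
    AND t2.locationcity = t1.locationcity
    AND t2.locationstate = t1.locationstate
    AND t2.id < t1.id)  -- delete a row only if a duplicate with a smaller id exists

This will leave one instance of each game in the database: the one with the smallest ID.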

+1

Gives a syntax error. –

+0

Great solution! However, on the last line it should be '<' in order to keep the row with the smallest ID. – nabroyan

1
DELETE FROM table
WHERE id IN
    (SELECT t.id
    FROM table AS t
    JOIN table AS tj ON (t.date = tj.date
          AND t.hometeam_id = tj.hometeam_id
          AND t.awayteam_id = tj.awayteam_id
          ...))
+0

This is a very complicated version of a simple 'delete from table' – piotrpo

+0

Oops, I missed t.id <> tj.id in the JOIN. – limscoder

2
delete from games 
    where id not in 
    (select max(id) from games 
    group by date, time, hometeam_id, awayteam_id, locationcity, locationstate 
    ); 

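Workaround

create temporary table temp_table as
    select max(id) as id from games
    group by date, time, hometeam_id, awayteam_id, locationcity, locationstate
    having count(*) > 1;  -- only groups that actually contain duplicates

-- removes one row (the highest id) from each duplicated group
delete from games where id in (select id from temp_table);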
+1

This approach will only delete one duplicate row per game, no matter how many duplicate rows exist for that game. –

+0

This gives me the same error I got from @Neville K's post. ERROR 1093 (HY000): You can't specify target table 'games' for update in FROM clause – cfrederich

+0

Can't I modify the table that the subquery is selecting from? – cfrederich

5

To get a list of the duplicate entries that match on two fields:

select t.ID, t.field1, t.field2 
from (
    select field1, field2 
    from table_name 
    group by field1, field2 
    having count(*) > 1) x, table_name t 
where x.field1 = t.field1 and x.field2 = t.field2 
order by t.field1, t.field2 

And to delete only the duplicates:

DELETE x 
FROM table_name x 
JOIN table_name y 
ON y.field1= x.field1 
AND y.field2 = x.field2 
AND y.id < x.id; 
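Applied to the games table from the question, the same pattern might look like the sketch below. The column list comes from the question. The START TRANSACTION and COMMIT wrapper is an added assumption, and it only gives you an undo option on a transactional engine such as InnoDB.

```sql
START TRANSACTION;

-- Remove every row for which an identical game with a smaller id exists.
DELETE x
FROM games AS x
INNER JOIN games AS y
    ON  y.date = x.date
    AND y.time = x.time
    AND y.hometeam_id = x.hometeam_id
    AND y.awayteam_id = x.awayteam_id
    AND y.locationcity = x.locationcity
    AND y.locationstate = x.locationstate
    AND y.id < x.id;

-- Check: this should return no rows once the duplicates are gone.
SELECT date, time, hometeam_id, awayteam_id,
       locationcity, locationstate, COUNT(*) AS copies
FROM games
GROUP BY date, time, hometeam_id, awayteam_id,
         locationcity, locationstate
HAVING COUNT(*) > 1;

COMMIT;  -- or ROLLBACK; if the check shows something unexpected
```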
+0

The query above does what's needed, but it removed the last row from the result set. So I modified the query as follows: DELETE x FROM table_name x JOIN table_name y ON y.field1 = x.field1 AND y.field2 = x.field2 AND y.id > x.id; – Vinayagam

7

What worked best for me was to recreate the table.

CREATE TABLE newtable SELECT * FROM oldtable GROUP BY field1,field2; 

Then you can rename it.
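A sketch of the whole rebuild for the games table from the question follows, under a few assumptions. The names games_dedup and games_backup are hypothetical. CREATE TABLE ... LIKE is used instead of CREATE TABLE ... SELECT so that the indexes and the auto_increment column are carried over; foreign key definitions are not copied and would have to be added again. The rows to keep are picked with MIN(id) rather than SELECT * ... GROUP BY, because a server running with ONLY_FULL_GROUP_BY enabled rejects that form.

```sql
-- Empty copy of the table, including its indexes.
CREATE TABLE games_dedup LIKE games;

-- Copy one row (the lowest id) for each distinct game.
INSERT INTO games_dedup
SELECT g.*
FROM games AS g
INNER JOIN (
    SELECT MIN(id) AS keep_id
    FROM games
    GROUP BY date, time, hometeam_id, awayteam_id,
             locationcity, locationstate
) AS keepers
    ON keepers.keep_id = g.id;

-- Swap the tables in one statement; the old data stays in games_backup.
RENAME TABLE games TO games_backup,
             games_dedup TO games;
```

Keeping games_backup around until the new table has been checked gives a simple way back if anything looks wrong.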

+2

This is by far the best and most straightforward solution. You can't go wrong with this. – Codex73

+0

One downside of this is that you lose the constraints, but you can 'TRUNCATE' oldtable and copy everything back from the new table, so it works like a charm – Hissvard

+1

The safest solution, better than a DELETE statement IMO. –