2015-05-13 16 views
2

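I have a table that looks like the one below. How can I create a flag column that takes a city value only when that value makes up the majority of the rows counted for an ID?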

id city  date 
1  chicago  5/1 
1  chicago  5/2 
1  new york 5/1 
2  new york 5/3 
2  seattle  . 
3  chicago  . 
4  seattle  . 
4  seattle  . 

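I want to create a new column that takes the value of 'city' when that particular city makes up the majority (> 51%) of the entries a single ID has. For example, id #1 would have favorite_city = 'chicago'. I don't know where to start...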

Any help is much appreciated. Thanks!

+0

Side note: I have already written the logic for handling IDs that have no majority, or too few entries to form one. – Patricia

+0

So the new column would hold the value 'chicago' for the first 3 rows. Is that what you want? – 54l3d

+0

Correct. In my other code I will select the distinct IDs and add that variable as part of a 'case when' statement. – Patricia

Answers

0
WITH summary AS (
    SELECT
        your_table.*,
        COUNT(*) OVER (PARTITION BY id)       AS id_count,       -- rows per id
        COUNT(*) OVER (PARTITION BY id, city) AS id_city_count   -- rows per (id, city)
    FROM
        your_table
)
SELECT
    summary.*,
    -- At most one city per id can exceed half of the rows, so MAX picks it (or NULL)
    MAX(
        CASE WHEN id_city_count * 2 > id_count THEN city ELSE NULL END
    ) OVER (PARTITION BY id) AS favorite_city
FROM
    summary
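
To see why the test is written as id_city_count * 2 > id_count rather than as a division, here is a minimal Python sketch of the same logic run against the sample rows from the question. The function name favorite_cities is made up for illustration, and the two Counter objects only stand in for the two window counts; this is not the query itself.

```python
from collections import Counter

# Sample rows (id, city) copied from the question's table.
rows = [
    (1, "chicago"), (1, "chicago"), (1, "new york"),
    (2, "new york"), (2, "seattle"),
    (3, "chicago"),
    (4, "seattle"), (4, "seattle"),
]


def favorite_cities(rows):
    """Map each id to its strict-majority city, or None if there is none."""
    id_count = Counter(i for i, _ in rows)  # like COUNT(*) OVER (PARTITION BY id)
    id_city_count = Counter(rows)           # like COUNT(*) OVER (PARTITION BY id, city)
    result = {i: None for i in id_count}
    for (i, city), n in id_city_count.items():
        # Doubling the count keeps the comparison in integers, so no
        # integer-division surprises can turn 2/3 into 0.
        if n * 2 > id_count[i]:
            result[i] = city
    return result


favorites = favorite_cities(rows)
print(favorites)
```

With the sample data, id 2 has no favorite because its two cities each hold exactly half of the rows, and exactly half is not a strict majority.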
0

This works well, but when an ID has several cities with equal counts it returns all of the tied cities (not just one):

WITH a AS (
    SELECT id, city
    FROM (
        SELECT id, city, nb,
               RANK() OVER (PARTITION BY id ORDER BY nb DESC) AS rnk
        FROM (
            SELECT id, city, COUNT(city) AS nb
            FROM test
            GROUP BY id, city
        ) AS t
    ) AS tt
    WHERE rnk = 1
)
SELECT test.id AS id, test.city AS city, a.city AS favcity
FROM test
JOIN a ON test.id = a.id

Live demo and output HERE
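
The tie behaviour mentioned above can be reproduced outside the database. The Python sketch below mimics keeping every city with rank 1 per id, using the question's sample rows; top_ranked_cities is a hypothetical helper name, not part of the query.

```python
from collections import Counter, defaultdict

# Sample rows (id, city) copied from the question's table.
rows = [
    (1, "chicago"), (1, "chicago"), (1, "new york"),
    (2, "new york"), (2, "seattle"),
    (3, "chicago"),
    (4, "seattle"), (4, "seattle"),
]


def top_ranked_cities(rows):
    """Return, per id, every city whose count ties for the highest (rank 1)."""
    counts = Counter(rows)  # rows per (id, city), like the inner GROUP BY
    best = defaultdict(int)
    for (i, _city), n in counts.items():
        best[i] = max(best[i], n)
    top = defaultdict(list)
    for (i, city), n in counts.items():
        if n == best[i]:  # ties all share rank 1
            top[i].append(city)
    return {i: sorted(cities) for i, cities in top.items()}


top = top_ranked_cities(rows)
print(top)
```

Here id 2 comes back with both 'new york' and 'seattle'. Note also that rank 1 means "most frequent", which is not the same as a majority: a city with 2 of 4 rows is top ranked without exceeding 50%, so a separate majority check is still needed if that is the requirement.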

0

Assuming you have already added the new column to your table (in my example the table is named test), you can run:

update test t 
    set favorite_city = 
     case 
      when 
       -- highest per-city count for this id, doubled to avoid integer division
       (select c.cnt from (select count(1) as cnt from test t_freq where t_freq.id = t.id group by city) as c order by c.cnt desc limit 1) * 2 > 
       (select count(1) from test t_all where t_all.id = t.id) 
      then 
       (select c.city from (select count(1) as cnt, city from test t_freq where t_freq.id = t.id group by city) as c order by c.cnt desc limit 1) 
      else 
       null 
     end; 
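
For anyone who wants to try the update idea without a database server, here is a small sketch using Python's standard sqlite3 module and the sample rows from the question. It is a simplified variant and not the statement above: it assumes SQLite, and it folds the majority test into a HAVING clause, which works because at most one city per id can hold more than half of the rows.

```python
import sqlite3

# In-memory SQLite database; the table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, city TEXT, favorite_city TEXT)")
conn.executemany(
    "INSERT INTO test (id, city) VALUES (?, ?)",
    [
        (1, "chicago"), (1, "chicago"), (1, "new york"),
        (2, "new york"), (2, "seattle"),
        (3, "chicago"),
        (4, "seattle"), (4, "seattle"),
    ],
)

# The scalar subquery returns the one city holding a strict majority for the
# row's id, or NULL when no city qualifies.
conn.execute(
    """
    UPDATE test
    SET favorite_city = (
        SELECT f.city
        FROM test AS f
        WHERE f.id = test.id
        GROUP BY f.city
        HAVING COUNT(*) * 2 > (SELECT COUNT(*) FROM test AS a WHERE a.id = test.id)
    )
    """
)

updated = conn.execute(
    "SELECT DISTINCT id, favorite_city FROM test ORDER BY id"
).fetchall()
print(updated)
```

Other databases may need small syntax changes, so treat this as a way to check the logic rather than as a drop-in statement.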