How do I create a flag column that takes a column's value only when that value makes up the majority of the counted values?

I have a table that looks like this:

id  city      date
1   chicago   5/1
1   chicago   5/2
1   new york  5/1
2   new york  5/3
2   seattle   .
3   chicago   .
4   seattle   .
4   seattle   .

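I want to create an additional column that takes the value of city whenever one particular city makes up the majority (> 51%) of the entries a single ID has. So, for example, id #1 would have favorite_city = 'chicago'. I don't know where to start...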
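To make the rule concrete, here is a minimal Python sketch (not SQL) that applies a strict-majority test to the sample rows above. The function name favorite_city_by_id is hypothetical, and reading "majority" as count * 2 > total is an assumption about what the question intends.

```python
from collections import Counter, defaultdict

# Sample rows from the question as (id, city); dates are not needed for the rule.
rows = [
    (1, "chicago"), (1, "chicago"), (1, "new york"),
    (2, "new york"), (2, "seattle"),
    (3, "chicago"),
    (4, "seattle"), (4, "seattle"),
]

def favorite_city_by_id(rows):
    """Map each id to the city holding a strict majority of its rows, else None."""
    counts = defaultdict(Counter)
    for row_id, city in rows:
        counts[row_id][city] += 1
    result = {}
    for row_id, city_counts in counts.items():
        total = sum(city_counts.values())
        city, top = city_counts.most_common(1)[0]
        # Assumption: "majority" means strictly more than half of the rows.
        # Comparing integers avoids floating-point percentages.
        result[row_id] = city if top * 2 > total else None
    return result

print(favorite_city_by_id(rows))
# {1: 'chicago', 2: None, 3: 'chicago', 4: 'seattle'}
```

Under this reading, id 2 gets no favorite city because its two cities are tied at one row each.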

Any help is much appreciated. Thanks!

Source

2015-05-13 Patricia

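Side note: I have already written the logic for handling IDs that have no majority or too few entries to form one. – Patricia

So the new column would have the value 'chicago' for the first 3 rows, is that what you want? – 54l3d

Correct. In my other code I will select the distinct IDs and add that variable as part of a 'case when' statement. – Patricia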

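WITH summary AS (
    SELECT
        your_table.*,
        -- total number of rows for this id
        COUNT(*) OVER (PARTITION BY id)       AS id_count,
        -- number of rows for this (id, city) pair
        COUNT(*) OVER (PARTITION BY id, city) AS id_city_count
    FROM
        your_table
)
SELECT
    summary.*,
    -- only a city with a strict majority survives the CASE; MAX ignores the NULLs
    MAX(
        CASE WHEN id_city_count * 2 > id_count THEN city ELSE NULL END
    ) OVER (PARTITION BY id) AS favorite_city
FROM
    summary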
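
As a rough check, the window-function query can be run against the sample data with Python's built-in sqlite3 module. This is only a sketch under several assumptions: the SQLite library is version 3.25 or newer (needed for window functions), the "." dates are stored as NULL, and DISTINCT is added to collapse the output to one row per id, as the asker says she plans to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, city TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        (1, "chicago", "5/1"), (1, "chicago", "5/2"), (1, "new york", "5/1"),
        (2, "new york", "5/3"), (2, "seattle", None),  # "." treated as NULL (assumption)
        (3, "chicago", None),
        (4, "seattle", None), (4, "seattle", None),
    ],
)

query = """
WITH summary AS (
    SELECT
        your_table.*,
        COUNT(*) OVER (PARTITION BY id)       AS id_count,
        COUNT(*) OVER (PARTITION BY id, city) AS id_city_count
    FROM your_table
)
SELECT DISTINCT
    id,
    MAX(CASE WHEN id_city_count * 2 > id_count THEN city END)
        OVER (PARTITION BY id) AS favorite_city
FROM summary
ORDER BY id
"""
favorites = conn.execute(query).fetchall()
print(favorites)
# [(1, 'chicago'), (2, None), (3, 'chicago'), (4, 'seattle')]
```

Multiplying by 2 rather than dividing keeps the comparison in integers, so the result does not depend on how a given engine handles integer division.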

Source

2015-05-13 22:17:08 MatBailie

This works well, but for an ID whose cities have equal counts it returns all of those cities (not just one):

with a as (
    select *
    from (
        select id, city, nb,
               rank() over (partition by id order by nb desc) as rnk
        from (
            -- number of rows per (id, city)
            select id, city, count(city) as nb
            from test
            group by id, city
        ) as t
    ) as tt
    where rnk = 1   -- keep the most frequent city (or cities, on a tie) per id
)
select test.id as id, test.city as city, a.city as favcity
from test, a
where test.id = a.id

Live demo and output HERE
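
The tie behaviour mentioned above can be seen with a small sketch using Python's sqlite3 module (assuming SQLite 3.25 or newer for RANK()). The CTE names counts and ranked are illustrative, and only the ranking step is reproduced, without the final join back to test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        (1, "chicago"), (1, "chicago"), (1, "new york"),
        (2, "new york"), (2, "seattle"),
        (3, "chicago"),
        (4, "seattle"), (4, "seattle"),
    ],
)

query = """
WITH counts AS (
    SELECT id, city, COUNT(city) AS nb
    FROM test
    GROUP BY id, city
),
ranked AS (
    SELECT id, city, nb,
           RANK() OVER (PARTITION BY id ORDER BY nb DESC) AS rnk
    FROM counts
)
SELECT id, city
FROM ranked
WHERE rnk = 1
ORDER BY id, city
"""
top_cities = conn.execute(query).fetchall()
print(top_cities)
# [(1, 'chicago'), (2, 'new york'), (2, 'seattle'), (3, 'chicago'), (4, 'seattle')]
```

Note that rank 1 selects the most frequent city, which is not necessarily a majority: a city holding 2 of 5 rows still ranks first if no other city has more.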

Source

2015-05-13 17:26:23 54l3d

Assuming you have already added the new column to your table (named test in my example), you can run:

update test t
    set favorite_city =
     case
      when
       -- count of the most frequent city for this id, doubled so no (integer) division is needed
       (select c.cnt from (select count(1) as cnt from test t_freq where t_freq.id=t.id group by city) as c order by 1 desc limit 1) * 2 >
       (select count(1) from test t_all where t_all.id=t.id)
      then
       -- that most frequent city
       (select c.city from (select count(1) as cnt, city from test t_freq where t_freq.id=t.id group by city) as c order by 1 desc limit 1)
      else
       null
     end;
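
For comparison, here is a sketch of the same update idea run through Python's sqlite3 module. It is a restructured variant, not the statement above: a single correlated subquery returns the majority city or NULL, and the table is referenced without an alias. The favorite_city column name comes from the question; everything else is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, city TEXT, favorite_city TEXT)")
conn.executemany(
    "INSERT INTO test (id, city) VALUES (?, ?)",
    [
        (1, "chicago"), (1, "chicago"), (1, "new york"),
        (2, "new york"), (2, "seattle"),
        (3, "chicago"),
        (4, "seattle"), (4, "seattle"),
    ],
)

# At most one city per id can exceed half of the rows, so the scalar
# subquery yields either that city or no row (which becomes NULL).
conn.execute("""
UPDATE test
SET favorite_city = (
    SELECT c.city
    FROM (SELECT id, city, COUNT(*) AS cnt FROM test GROUP BY id, city) AS c
    WHERE c.id = test.id
      AND c.cnt * 2 > (SELECT COUNT(*) FROM test AS t_all WHERE t_all.id = test.id)
)
""")

updated = conn.execute(
    "SELECT id, city, favorite_city FROM test ORDER BY rowid"
).fetchall()
for row in updated:
    print(row)
```

Rows for id 2 keep favorite_city as NULL because neither of its cities exceeds half of the entries.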

Source

2015-05-13 21:44:18
