2015-01-14 92 views

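Calculating percentages in Hive

I am trying to write a simple query that calculates the percentage each distinct value in a table contributes to the total. Can I do this in a single query?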

Below is my code, which gives me an error.

select 100 * total_sum/sum(total_sum) from jav_test; 

Answers


I had to do something similar in the past, and this is the approach I took:

SELECT 
    jav_test.total_sum AS total_sum, 
    withsum.total_sum AS sum_of_all_total_sum, 
    100 * (jav_test.total_sum/withsum.total_sum) AS percentage 
FROM 
    jav_test, 
    (SELECT sum(total_sum) AS total_sum FROM jav_test) withsum -- This computes sum(total_sum) here as a single-row single-column table aliased as "withsum" 
; 

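The total_sum and sum_of_all_total_sum columns are in the output only so I could convince myself that the math was coming out right. Based on the query you posted in the question, the number you are interested in is percentage.

After populating a small dummy table, this is the result: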

hive> describe jav_test; 
OK 
total_sum     int         
Time taken: 1.777 seconds, Fetched: 1 row(s) 
hive> select * from jav_test; 
OK 
28 
28 
90113 
90113 
323694 
323694 
Time taken: 0.797 seconds, Fetched: 6 row(s) 
hive> SELECT 
    > jav_test.total_sum AS total_sum, 
    > withsum.total_sum AS sum_of_all_total_sum, 
    > 100 * (jav_test.total_sum/withsum.total_sum) AS percentage 
    > FROM jav_test, (SELECT sum(total_sum) AS total_sum FROM jav_test) withsum; 
... 
... lots of mapreduce-related spam here 
... 
Total MapReduce CPU Time Spent: 3 seconds 370 msec 
OK 
28 827670 0.003382990805514275 
28 827670 0.003382990805514275 
90113  827670 10.887551802046708 
90113  827670 10.887551802046708 
323694  827670 39.10906520714777 
323694  827670 39.10906520714777 
Time taken: 41.257 seconds, Fetched: 6 row(s) 
hive> 
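
The figures in the session above can be checked without a Hive cluster. What follows is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Hive, the helper name `percentages` is made up for illustration, and the six sample values are copied from the dummy table. SQLite is not Hive, so this confirms the arithmetic and the shape of the join, not Hive's behavior.

```python
import sqlite3

# Sample values copied from the dummy jav_test table in the session above.
ROWS = [28, 28, 90113, 90113, 323694, 323694]


def percentages(values):
    """Return (total_sum, sum_of_all_total_sum, percentage) tuples.

    Hypothetical helper: SQLite stands in for Hive here.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE jav_test (total_sum INTEGER)")
        conn.executemany(
            "INSERT INTO jav_test VALUES (?)", [(v,) for v in values]
        )
        # Same shape as the Hive query: join every row against a one-row
        # subquery holding the grand total. The 100.0 literal forces
        # floating-point division, because SQLite truncates
        # integer / integer (the Hive session above returned doubles).
        query = """
            SELECT
                jav_test.total_sum,
                withsum.total_sum,
                100.0 * jav_test.total_sum / withsum.total_sum
            FROM jav_test
            CROSS JOIN (SELECT sum(total_sum) AS total_sum FROM jav_test) withsum
            ORDER BY jav_test.total_sum
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in percentages(ROWS):
        print(row)
```

The percentage column should agree with the Hive output up to floating-point rounding, and the six percentages should add up to 100, which is a quick way to confirm that the denominator really is the sum over the whole table.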

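Thanks for your reply, rchang. I tried it but I still get the same error. In my code, total_sum refers to a column that holds some values, and sum(total_sum) gives a single value. But when I run the command it produces an error. – Javad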


@Javad What error are you getting? I was able to run the query against a dummy table (only six rows or so). – rchang


FAILED: Error in semantic analysis: Line 2:2 Expression not in GROUP BY key jav_test – Javad