Creating pivot tables in SQL like in SPSS

I have a lot of data in PostgreSQL, but I need to build pivot tables the way SPSS does. For example, I have a table of cities and states.

create table cities 
(
    city integer, 
    state integer 
); 
insert into cities(city,state) values (1,1); 
insert into cities(city,state) values (2,2); 
insert into cities(city,state) values (3,1); 
insert into cities(city,state) values (4,1);

This table holds 4 cities and 2 states. The pivot table I want looks like this:

city\state | state-1 | state-2 |
city1      | 33%     | 0%      |
city2      | 0%      | 100%    |
city3      | 33%     | 0%      |
city4      | 33%     | 0%      |
totalCount | 3       | 1       |

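I understand how to do this in SQL for this particular case. What I want, though, is to cross one variable against another (count the distinct values and divide by "count(*) where variable_in_column_names = 1", and so on) using some stored function. I am looking at plpython. Some of my questions are: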

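How can I output a recordset without a temporary table whose shape matches the number and types of the output columns?
Are there perhaps existing solutions?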

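As I see it, the input would be the table name, the column name of the first variable, and the column name of the second variable. The function body would run many count(*) queries (looping over each distinct value of the variables and counting them), then return a table of percentages.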
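The approach described above can be sketched in plain Python. This is a minimal illustration, not a plpython function: `column_percent_table` and its parameters are hypothetical names, and it assumes the rows have already been fetched as a list of dicts rather than being queried by table name.

```python
from collections import Counter


def column_percent_table(rows, row_key, col_key):
    """Cross-tabulate rows (a list of dicts) into column percentages.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Count each (row value, column value) pair and each column value.
    pair_counts = Counter((r[row_key], r[col_key]) for r in rows)
    col_totals = Counter(r[col_key] for r in rows)
    row_values = sorted({r[row_key] for r in rows})
    col_values = sorted(col_totals)

    table = {}
    for rv in row_values:
        # Percentage of this column's rows that fall in this row value.
        table[rv] = {
            cv: 100.0 * pair_counts[(rv, cv)] / col_totals[cv]
            for cv in col_values
        }
    # Final row: raw counts per column, as in the desired layout.
    table["totalCount"] = {cv: col_totals[cv] for cv in col_values}
    return table


cities = [
    {"city": 1, "state": 1},
    {"city": 2, "state": 2},
    {"city": 3, "state": 1},
    {"city": 4, "state": 1},
]
result = column_percent_table(cities, "city", "state")
for label, cells in result.items():
    print(label, cells)
```

One pass over the data fills both counters, so no per-value count(*) query is needed once the rows are in memory.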

In fact, my query returns about 10K rows, so maybe the best way to do this is in raw Python rather than plpython?

Source

2012-12-10 norecces

Check out the 'crosstab' function in the 'tablefunc' module: http://www.postgresql.org/docs/current/static/tablefunc.html

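I've looked at crosstab before, but it is not a complete solution; it only simplifies the input. I can't add totals in crosstab or attach labels to the variables. So I think the function would return a table the way crosstab does, but I would also have to do a lot of calculations (totals, percentages, etc.). – norecces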

You might want to give pandas a try; it is a great Python data analysis library.

To query the PostgreSQL database:

import psycopg2
import pandas as pd

# Connection details are placeholders; substitute your own.
conn_string = "host='localhost' dbname='mydb' user='postgres' password='password'"
conn = psycopg2.connect(conn_string)
# frame_query has been removed from pandas; read_sql_query replaces it.
df = pd.read_sql_query('select * from cities', con=conn)

where df is a DataFrame like this:

city state 
0 1 1 
1 2 2 
2 3 1 
3 4 1

You can then build the pivot table with pivot_table and divide by the totals to get the percentages:

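totals = df.groupby('state').size()  # number of cities per state
# rows=/cols= are now index=/columns=; a helper column of ones gives pivot_table something to sum
counts = df.assign(n=1).pivot_table(index='city', columns='state', values='n', aggfunc='sum', fill_value=0)
pivot = counts / totals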

This gives you the result:

state 1 2 
city   
1 0.333333 0 
2 0   1 
3 0.333333 0 
4 0.333333 0

Finally, to get the layout you want, you only need to rename the index and columns and append the totals:

totals_frame = pd.DataFrame(totals).T
totals_frame.index = ['totalCount']

pivot.index = ['city%i' % item for item in pivot.index]
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
final_result = pd.concat([pivot, totals_frame])
final_result.columns = ['state-%i' % item for item in final_result.columns]

This gives you:

  state-1  state-2 
city1  0.333333 0 
city2  0.000000 1 
city3  0.333333 0 
city4  0.333333 0 
totalCount 3.000000 1
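If the cells should read "33%" as in the layout the question asks for, the fractions can be turned into strings. The following is a rough sketch that builds the same data in memory instead of querying PostgreSQL, and it swaps in pd.crosstab for the counting step purely for brevity; the variable names are illustrative.

```python
import pandas as pd

# Same data as the cities table, built in memory for the example.
df = pd.DataFrame({"city": [1, 2, 3, 4], "state": [1, 2, 1, 1]})

# Counts of each city per state, then per-state totals.
counts = pd.crosstab(df["city"], df["state"])
totals = counts.sum(axis=0)

# Column percentages, rounded to whole numbers and rendered as text.
percent = (100 * counts / totals).round().astype(int).astype(str) + "%"
percent.index = [f"city{c}" for c in percent.index]

# Totals row as text so the whole report has one column type.
total_row = totals.astype(str).to_frame("totalCount").T

report = pd.concat([percent, total_row])
report.columns = [f"state-{s}" for s in report.columns]
print(report)
```

Keeping the numeric frame and formatting only at the end means the percentages stay available for further calculation.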

Source

2012-12-11 21:52:29

Thanks! pandas works for me. The job is nearly done. – norecces

Take a look at PostgreSQL window functions. They might give you a non-(pl)python solution. http://blog.hashrocket.com/posts/sql-window-functions
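As a rough illustration of the window-function idea, the sketch below computes each city's share of its state in a single query. To stay runnable without a server it uses Python's built-in sqlite3 with an in-memory database as a stand-in (this needs SQLite 3.25 or later); the query uses only standard `count(*) over (partition by ...)` syntax, so it should carry over to PostgreSQL, but that is an assumption and is not tested here.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("create table cities (city integer, state integer)")
conn.executemany(
    "insert into cities(city, state) values (?, ?)",
    [(1, 1), (2, 2), (3, 1), (4, 1)],
)

# For every row: count within (city, state) divided by count within state.
query = """
select distinct
       city,
       state,
       100.0 * count(*) over (partition by city, state)
             / count(*) over (partition by state) as pct
from cities
order by city
"""
rows = conn.execute(query).fetchall()
for city, state, pct in rows:
    print(f"city{city} state-{state} {pct:.0f}%")
```

The result is in long form (one row per city and state pair); spreading the states into columns would still need crosstab or a client-side pivot.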

Source

2012-12-15 23:21:40 Carlos
