Join on a file_fdw foreign table and a postgres_fdw foreign table

In PostgreSQL 9.5: joining a file_fdw foreign table with a postgres_fdw foreign table

I have one foreign table named sheetheight (created with file_fdw) and another foreign table named dzlog (created with postgres_fdw).

1- To join the two foreign tables, I have the following query:

SELECT * from dzlog INNER JOIN sheetheight ON dzlog.ullid = sheetheight.ullid;

EXPLAIN ANALYZE returns this for the query above:

-------------------------------------------------
Hash Join (cost=111.66..13688.18 rows=20814 width=2180) (actual time=7670.872..8527.844 rows=2499 loops=1)
    Hash Cond: (sheetheight.ullid = dzlog.ullid)
    -> Foreign Scan on sheetheight (cost=0.00..12968.10 rows=106741 width=150) (actual time=0.116..570.571 rows=223986 loops=1)
         Foreign File: D:\code\sources\sheetHeight_20151025_221244_00000000049876878996.csv
         Foreign File Size: 18786370
    -> Hash (cost=111.17..111.17 rows=39 width=2030) (actual time=7658.661..7658.661 rows=34107 loops=1)
         Buckets: 2048 (originally 1024) Batches: 32 (originally 1) Memory Usage: 4082kB
         -> Foreign Scan on dzlog (cost=100.00..111.17 rows=39 width=2030) (actual time=47.162..7578.990 rows=34107 loops=1)
Planning time: 8.755 ms
Execution time: 8530.917 ms
(10 rows)

The output of the query has two columns named ullid:

ullid, date, color, sheetid, dz0, dz1, dz2, dz3, dz4, dz5, dz6, dz7, ullid, sheetid, pass, ...

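2- Accessing the CSV file and the local SQL table directly from a Python application: I ran the same join without FDW, reading the CSV file and the local PostgreSQL table directly from Python and combining them with a pandas DataFrame merge. This is a "raw" join: I first read the CSV file, then fetch the SQL table with pandas, and then merge the two on their common column.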

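import pandas as pd
import psycopg2

def rawjoin(query="SELECT * FROM dzlog;", connection=None):
    # Connect on demand rather than once at function-definition time.
    if connection is None:
        connection = psycopg2.connect("dbname='mydb' user='qfsa' host='localhost' password='123' port=5433")
    # Raw string, so the backslashes in the Windows path are not read as escapes.
    firstTable = pd.read_csv(r'.\sources\sheetHeight_20151025_221244_000000000498768789.csv', delimiter=';', header=0)
    secondTable = pd.read_sql(query, connection)
    # Inner join on the shared key column.
    merged = pd.merge(firstTable, secondTable, on='ullid', how='inner')
    return merged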

The result of merging the two DataFrames is a single DataFrame with only one ullid column.
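The single-key behaviour can be reproduced on two small DataFrames. This is only a sketch: the rows below are made up, and only the column names ullid, color, sheetid and defectfound come from the tables above. It shows that `pd.merge(..., on='ullid')` keeps one ullid column, while a non-key column present on both sides (here sheetid) is kept twice with the default `_x`/`_y` suffixes.

```python
import pandas as pd

# Hypothetical sample rows standing in for dzlog and sheetheight.
dzlog = pd.DataFrame({
    "ullid": [1, 2, 3],
    "color": ["red", "green", "blue"],
    "sheetid": [10, 20, 30],
})
sheetheight = pd.DataFrame({
    "ullid": [2, 3, 4],
    "sheetid": [20, 30, 40],
    "defectfound": [False, True, False],
})

# Inner join on the shared key: the key column appears only once.
merged = pd.merge(dzlog, sheetheight, on="ullid", how="inner")

merged_columns = list(merged.columns)
print(merged_columns)
print(len(merged))
```

So the pandas merge behaves like a join that collapses the key column, not like `SELECT *` over a join written with `ON`.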

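Any ideas about this difference? I also ran other kinds of joins, and for those the results from raw access and from FDW access were the same. The other queries are: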

q7=("SELECT dzlog.color FROM dzlog,sheetheight WHERE dzlog.ullid = sheetheight.ullid;") 
q8=("SELECT sheetheight.defectfound FROM dzlog, sheetheight WHERE dzlog.ullid = sheetheight.ullid;") 
q9=("SELECT dzlog.color, sheetheight.defectfound FROM dzlog, sheetheight WHERE dzlog.ullid= sheetheight.ullid;")

2017-04-06 User193452

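I don't know what your second example does, so it's hard to say. Which library is it using? Does it generate SQL, or is the join performed in the application (which is almost always a performance loss)? If it results in an SQL statement, what is that statement?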

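The first query returns the column twice because you asked for all columns from all tables involved, and both tables have this column. The join condition merely forces the two copies to be equal; it does not remove either of them.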

You can write the SQL statement so that the column is output only once, like this:

SELECT * 
FROM dzlog 
    JOIN sheetheight 
     USING (ullid);

That looks a lot like the code in your second example, doesn't it?
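The difference between the two forms can be checked without a PostgreSQL server. The sketch below assumes an in-memory SQLite database as a stand-in for the two foreign tables, with made-up rows; `column_names` is a helper introduced only for this illustration. For `SELECT *`, SQLite treats `ON` and `USING` the same way as described above: `ON` keeps both ullid columns, `USING` outputs the join column once.

```python
import sqlite3

# In-memory SQLite database standing in for the two foreign tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dzlog (ullid INTEGER, color TEXT);
    CREATE TABLE sheetheight (ullid INTEGER, defectfound INTEGER);
    INSERT INTO dzlog VALUES (1, 'red'), (2, 'green');
    INSERT INTO sheetheight VALUES (2, 0), (3, 1);
""")

def column_names(sql):
    """Run a query and return the names of its output columns."""
    cursor = conn.execute(sql)
    return [col[0] for col in cursor.description]

# Join written with ON: every column of both tables is returned.
on_columns = column_names(
    "SELECT * FROM dzlog INNER JOIN sheetheight ON dzlog.ullid = sheetheight.ullid"
)
# Join written with USING: the join column is returned only once.
using_columns = column_names("SELECT * FROM dzlog JOIN sheetheight USING (ullid)")

print(on_columns)
print(using_columns)
```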

2017-04-06 10:33:39

Yes, the query was the problem. Your query is correct. – User193452
