2016-12-15
0

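PostgreSQL: Enforcing row order in a table

I use PostgreSQL 9.6.1 on a Windows 7 laptop to compile and analyze large datasets from different sources. One of my clients noticed that in the final report I gave them, some people from her state were mixed in with other states.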

For this report, I create the final table:

CREATE UNLOGGED TABLE LPIS_IssuanceDetail (
    ID SERIAL PRIMARY KEY, 
    Zone TEXT DEFAULT NULL, 
    State TEXT DEFAULT NULL, 
    LastName TEXT DEFAULT NULL, 
    FirstName TEXT DEFAULT NULL, 
    Email TEXT DEFAULT NULL, 
    UPN TEXT DEFAULT NULL, 
    LincPassUsed TEXT DEFAULT NULL, 
    EmployeeID TEXT DEFAULT NULL, 
    EmploymentType TEXT DEFAULT NULL, 
    NonEmployeeCategory TEXT DEFAULT NULL, 
    EmploymentStatus TEXT DEFAULT NULL, 
    ISAComplete TEXT DEFAULT NULL, 
    ISACompletionDate TIMESTAMP WITHOUT TIME ZONE, 
    LincPassStatus TEXT DEFAULT NULL, 
    ERO TEXT DEFAULT NULL, 
    Sponsored TEXT DEFAULT NULL, 
    Enrolled TEXT DEFAULT NULL, 
    Adjudicated TEXT DEFAULT NULL, 
    ShipToSite TEXT DEFAULT NULL, 
    ValidSite TEXT DEFAULT NULL, 
    CardExpiration DATE, 
    CertExpiration DATE, 
    LastEnrollment DATE, 
    EnrollmentExpiration DATE, 
    NewEnrollment TEXT DEFAULT NULL, 
    Sponsor TEXT DEFAULT NULL, 
    ContractEnd DATE, 
    ContractID TEXT DEFAULT NULL, 
    ContractPOC TEXT DEFAULT NULL 
); 

I then populate this table with data from the master data table:

INSERT INTO LPIS_IssuanceDetail (
    Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID, 
    EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete, 
    ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated, 
    ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration, 
    CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC 
) 
SELECT 
    Zone, StateName, MAS_LastName, MAS_FirstName, MAS_Email, MAS_UPN, 
    LincPassUsed, MAS_EmployeeID, MAS_Category, MAS_OrgRelType, 
    MAS_EmploymentStatus, ISAComplete, ISA_CompletionDate, MAS_IssuanceStatus, 
    MAS_FedEmerResponse, Sponsored, Enrolled, Adjudicated, MAS_ShipToCityState, 
    MAS_ValidShipToSite, MAS_CertExpireDate, MAS_LastEnrollmentDate, MAS_EnrollExpireDate, 
    MAS_CardExpireDate, MAS_NewEnrollment, MAS_Sponsor, MAS_PeriodofPerformanceEndDate, 
    MAS_ContractID, MAS_ContractPOC 
FROM LPIS_MasterData 
ORDER BY Zone, StateName, MAS_LastName, MAS_FirstName; 

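Sure enough, when I scroll down through this table, I find individual records interspersed out of sequence, as in this sample, where a single Maine record is out of place: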

id  | zone | state | lastname | firstname 
11849 | 3 | Georgia | Roberts | George 
11850 | 3 | Georgia | Smith | Dan 
11922 | 3 | Maine | Edwards | John 
11851 | 3 | Georgia | Snowden | Ed 
11852 | 3 | Georgia | Williams | Casey 

As a test, I dumped just the first four columns into a separate table, like this:

CREATE UNLOGGED TABLE LPIS_DetailTest (
    ID SERIAL PRIMARY KEY, 
    Zone TEXT DEFAULT NULL, 
    State TEXT DEFAULT NULL, 
    LastName TEXT DEFAULT NULL, 
    FirstName TEXT DEFAULT NULL 
); 

INSERT INTO LPIS_DetailTest (
    Zone, State, LastName, FirstName 
) 
SELECT 
    Zone, State, LastName, FirstName 
    FROM LPIS_IssuanceDetail 
    ORDER BY Zone, State, LastName, FirstName; 

And all the rows are in the expected order:

id  | zone | state | lastname | firstname 
11849 | 3 | Georgia | Roberts | George 
11850 | 3 | Georgia | Smith | Dan 
11851 | 3 | Georgia | Snowden | Ed 
11852 | 3 | Georgia | Williams | Casey 
11853 | 3 | Georgia | Spaid | Dennis 

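Why does this smaller table sort correctly with the exact same ORDER BY clause as the larger table, where some rows are out of order?

The database and all tables are set to UTF8.

I have looked at everything and cannot figure out why the ORDER BY clause produces such strange results. What else can I check?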

Edit: additional information

In my script, immediately after the INSERT INTO ... SELECT ... statement, I dump the data to a CSV file with COPY, like this:

-- Export data to CSV files 
COPY LPIS_IssuanceDetail (
    Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID, 
    EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete, 
    ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated, 
    ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration, 
    CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC 
) 
TO 'C:/Users/Michael.Sheaver/Documents/LincPass/Datasets/Compiled Reports/LPIS_IssuanceDetail.csv' 
WITH (
    FORMAT CSV, 
    DELIMITER ',', 
    NULL '', 
    HEADER TRUE, 
    QUOTE '"', 
    ENCODING 'UTF8' 
); 

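When I then import that CSV file into a spreadsheet for final presentation, I have to manually sort the data on the ID column and then delete that column.

New question: Is there any option I can use in the INSERT INTO statement that will strictly preserve the row order specified in the ORDER BY clause?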

+2

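"*I scroll down through this table*" - and how is the result that you "scroll down" through produced? If that select has no 'order by', the order of the rows is undefined. Just because you inserted the rows in a specific order does not mean a 'select' will return them in that order. The **only** (really: the **only**) way to get a consistent order is to use an order by when selecting the rows. The 'order by' you use in the source of the insert statement is essentially useless. –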

+0

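@a_horse_with_no_name, I suspected as much, and it leaves me in a bit of a dilemma. Immediately after the SELECT statement, I use COPY ... TO ... to dump the processed dataset to a CSV file, and the syntax of COPY does not support ORDER BY. –

Answer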

1

If you want the data in the CSV file to be sorted, use copy with a select statement:

COPY (select Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID, 
    EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete, 
    ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated, 
    ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration, 
    CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC 
    from LPIS_IssuanceDetail 
    ORDER BY Zone, State, LastName, FirstName 
) 
TO 'C:/Users/Michael.Sheaver/Documents/LincPass/Datasets/Compiled Reports/LPIS_IssuanceDetail.csv' 
WITH (FORMAT CSV, DELIMITER ',', NULL '', HEADER TRUE, QUOTE '"', ENCODING 'UTF8'); 
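
The same rule applies to the on-screen check described in the question: rows are only guaranteed to come back sorted when the query that reads them carries its own ORDER BY. A minimal sketch, assuming the table and column names from the question and limited to the columns shown in its sample output:

```sql
-- Sketch only: read the rows back in a guaranteed order.
-- Table and column names are taken from the question; without this
-- ORDER BY, PostgreSQL may return the rows in any order it likes.
SELECT ID, Zone, State, LastName, FirstName
FROM LPIS_IssuanceDetail
ORDER BY Zone, State, LastName, FirstName;
```

With the ORDER BY on the reading query, the order in which the rows were originally inserted no longer matters.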
+0

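I must say this solution is remarkably simple! After seeing your answer, I went back to the PostgreSQL documentation page for the COPY statement and, sure enough, buried in the syntax is the (query) entry, which of course I had missed! Your help is most appreciated! –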