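PostgreSQL: Enforcing row order in a table
I am using PostgreSQL 9.6.1 on a Windows 7 laptop to compile and analyze large datasets from different sources. One of my clients noticed that, in the final report I provided to them, some people from her state were mixed in with people from other states.
For this report, I create the final table: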
CREATE UNLOGGED TABLE LPIS_IssuanceDetail (
ID SERIAL PRIMARY KEY,
Zone TEXT DEFAULT NULL,
State TEXT DEFAULT NULL,
LastName TEXT DEFAULT NULL,
FirstName TEXT DEFAULT NULL,
Email TEXT DEFAULT NULL,
UPN TEXT DEFAULT NULL,
LincPassUsed TEXT DEFAULT NULL,
EmployeeID TEXT DEFAULT NULL,
EmploymentType TEXT DEFAULT NULL,
NonEmployeeCategory TEXT DEFAULT NULL,
EmploymentStatus TEXT DEFAULT NULL,
ISAComplete TEXT DEFAULT NULL,
ISACompletionDate TIMESTAMP WITHOUT TIME ZONE,
LincPassStatus TEXT DEFAULT NULL,
ERO TEXT DEFAULT NULL,
Sponsored TEXT DEFAULT NULL,
Enrolled TEXT DEFAULT NULL,
Adjudicated TEXT DEFAULT NULL,
ShipToSite TEXT DEFAULT NULL,
ValidSite TEXT DEFAULT NULL,
CardExpiration DATE,
CertExpiration DATE,
LastEnrollment DATE,
EnrollmentExpiration DATE,
NewEnrollment TEXT DEFAULT NULL,
Sponsor TEXT DEFAULT NULL,
ContractEnd DATE,
ContractID TEXT DEFAULT NULL,
ContractPOC TEXT DEFAULT NULL
);
I then populate this table with data from the master data table:
INSERT INTO LPIS_IssuanceDetail (
Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID,
EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete,
ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated,
ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration,
CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC
)
SELECT
Zone, StateName, MAS_LastName, MAS_FirstName, MAS_Email, MAS_UPN,
LincPassUsed, MAS_EmployeeID, MAS_Category, MAS_OrgRelType,
MAS_EmploymentStatus, ISAComplete, ISA_CompletionDate, MAS_IssuanceStatus,
MAS_FedEmerResponse, Sponsored, Enrolled, Adjudicated, MAS_ShipToCityState,
MAS_ValidShipToSite, MAS_CertExpireDate, MAS_LastEnrollmentDate, MAS_EnrollExpireDate,
MAS_CardExpireDate, MAS_NewEnrollment, MAS_Sponsor, MAS_PeriodofPerformanceEndDate,
MAS_ContractID, MAS_ContractPOC
FROM LPIS_MasterData
ORDER BY Zone, StateName, MAS_LastName, MAS_FirstName;
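Sure enough, when I scroll down through this table, I find individual records interspersed out of sequence, as in this sample, where one record from Maine is out of place: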
id | zone | state | lastname | firstname
11849 | 3 | Georgia | Roberts | George
11850 | 3 | Georgia | Smith | Dan
11922 | 3 | Maine | Edwards | John
11851 | 3 | Georgia | Snowden | Ed
11852 | 3 | Georgia | Williams | Casey
As a test, I dumped just the first four columns into a separate table, like so:
CREATE UNLOGGED TABLE LPIS_DetailTest (
ID SERIAL PRIMARY KEY,
Zone TEXT DEFAULT NULL,
State TEXT DEFAULT NULL,
LastName TEXT DEFAULT NULL,
FirstName TEXT DEFAULT NULL
);
INSERT INTO LPIS_DetailTest (
Zone, State, LastName, FirstName
)
SELECT
Zone, State, LastName, FirstName
FROM LPIS_IssuanceDetail
ORDER BY Zone, State, LastName, FirstName;
And all of the rows are in the expected order:
id | zone | state | lastname | firstname
11849 | 3 | Georgia | Roberts | George
11850 | 3 | Georgia | Smith | Dan
11851 | 3 | Georgia | Snowden | Ed
11852 | 3 | Georgia | Williams | Casey
11853 | 3 | Georgia | Spaid | Dennis
Why does this smaller table come out correctly ordered with the exact same ORDER BY clause as the larger table, where some rows are out of order?
The database and all tables are set to UTF8.
I have looked at everything I can think of and cannot figure out why the ORDER BY clause produces such strange results. What else can I check?
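One possible check, sketched here under the assumption that the IDs were assigned by the SERIAL column during the single INSERT ... SELECT above, is to compare each row's position by ID with its position by the sort columns. The aliases pos_by_id and pos_by_sort are illustrative names, not part of the original script. In the sample above, the Maine row carries ID 11922, higher than the surrounding Georgia IDs, which suggests the IDs may have been assigned in sorted order and only the display order differs.

```sql
-- List rows whose position by ID disagrees with their position by the
-- intended sort key. ID is added as a tie-breaker so that rows with
-- identical sort keys do not produce false mismatches.
SELECT ID, Zone, State, LastName, FirstName, pos_by_id, pos_by_sort
FROM (
    SELECT ID, Zone, State, LastName, FirstName,
           row_number() OVER (ORDER BY ID) AS pos_by_id,
           row_number() OVER (ORDER BY Zone, State, LastName, FirstName, ID) AS pos_by_sort
    FROM LPIS_IssuanceDetail
) AS t
WHERE pos_by_id <> pos_by_sort
ORDER BY pos_by_id;
```

If this returns no rows, the IDs follow the intended order, and any disorder seen while browsing comes from reading the table without an ORDER BY.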
EDIT: Additional information
In my script, immediately following the INSERT INTO ... SELECT ... statement, I dump the data to a CSV file with COPY, like so:
-- Export data to CSV files
COPY LPIS_IssuanceDetail (
Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID,
EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete,
ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated,
ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration,
CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC
)
TO 'C:/Users/Michael.Sheaver/Documents/LincPass/Datasets/Compiled Reports/LPIS_IssuanceDetail.csv'
WITH (
FORMAT CSV,
DELIMITER ',',
NULL '',
HEADER TRUE,
QUOTE '"',
ENCODING 'UTF8'
);
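I then import that CSV file into a spreadsheet for the final presentation, where I have to manually sort the data on the ID column and then delete that column.
New question: Is there any option I can use in the INSERT INTO statement that will strictly preserve the row order specified in the ORDER BY clause?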
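"*I scroll down through this table*" - how were the results you are "scrolling down" through produced? If that select has no 'order by', the order of the rows is undefined. Just because you inserted the rows in a particular order does not mean a 'select' will return them in that order. The ***only*** (really: the **only**) way to get a consistent order is to use an order by when selecting the rows. The 'order by' you used in the source of the insert statement is essentially useless.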
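As a minimal illustration of the point in this comment, the table can be browsed with an explicit ORDER BY instead of relying on the order in which rows happen to be returned. The column subset here simply mirrors the sample output in the question.

```sql
-- Without ORDER BY, PostgreSQL may return rows in any order.
-- Stating the order in the SELECT itself makes the result deterministic;
-- ID is appended only to break ties between identical names.
SELECT ID, Zone, State, LastName, FirstName
FROM LPIS_IssuanceDetail
ORDER BY Zone, State, LastName, FirstName, ID;
```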
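@a_horse_with_no_name, I suspected as much, and it leaves me in a bit of a quandary. Immediately after the SELECT statement, I use COPY ... TO ... to dump the processed dataset to a CSV file, and the syntax of COPY does not support ORDER BY.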
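One way around this, sketched here rather than taken from the thread: COPY itself has no ORDER BY clause, but it accepts a parenthesized query in place of a table name, and that query can be ordered. The file path and column list below are the ones from the question. Adding ID as a final tie-breaker is an assumption about the desired result.

```sql
-- Export from an ordered query instead of directly from the table.
-- ID is used only for sorting and is not written to the file.
COPY (
    SELECT
        Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID,
        EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete,
        ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated,
        ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration,
        CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC
    FROM LPIS_IssuanceDetail
    ORDER BY Zone, State, LastName, FirstName, ID
)
TO 'C:/Users/Michael.Sheaver/Documents/LincPass/Datasets/Compiled Reports/LPIS_IssuanceDetail.csv'
WITH (
    FORMAT CSV,
    DELIMITER ',',
    NULL '',
    HEADER TRUE,
    QUOTE '"',
    ENCODING 'UTF8'
);
```

With this form the CSV rows are written in the order the query specifies, so the manual sort on the ID column in the spreadsheet should no longer be needed.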