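How to handle bad data quality in a SQL query
The code below is a sample of temperature data, grouped by value, from our source system (bear in mind these are temperatures taken from a person in a hospital).
Obviously the data is terrible, but I'm wondering whether it is possible to somehow convert this data into an INT. We have a UOM (unit of measure) field, so we only need the number.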
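Data problems:
88 is obviously Fahrenheit and not Celsius
3635 would be 36.35
0.368 would be 36.8
37.3. would be 37.3
.37.7 would be 37.7
377 would be 37.7
.3.8 would be 38
I think it is fair to simply exclude any other variations as invalid data, since no sensible assumption can be made about them accurately.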
DECLARE @Test TABLE (
[Temperature] VARCHAR(500),
[Count] VARCHAR(50)
)
INSERT INTO @Test ([Temperature],[Count])
VALUES
('34.4 oC',' 9 '),
('36.02 oC',' 1 '),
('36.36 oC',' 3 '),
('36.5 oC',' 5593 '),
('36.5. oC',' 1 '),
('36.6. oC',' 2 '),
('36.74 oC',' 2 '),
('36.82 oC',' 2 '),
('37.36 oC',' 2 '),
('37.49 oC',' 4 '),
('40 oC',' 1 '),
('88 oC',' 1 '),
(' 3635 oC',' 1 '),
(' .368 oC',' 1 '),
('33.5 oC',' 1 '),
('35.2 oC',' 84 '),
('35.20 oC',' 1 '),
('35.99 oC',' 1 '),
('36.35 oC',' 2 '),
('37.3. oC',' 1 '),
('39.5 oC',' 5 '),
('86 oC',' 1 '),
(' 356 oC',' 12 '),
(' 364 oC',' 72 '),
(' 379 oC',' 9 '),
(' 385 oC',' 2 '),
(' 3535 oC',' 1 '),
(' .37.7 oC',' 1 '),
('35.5 oC',' 290 '),
('35.87 oC',' 1 '),
('36..6 oC',' 1 '),
('36.25 oC',' 2 '),
('36.45 oC',' 2 '),
('36.62 oC',' 2 '),
('36.68 oC',' 5 '),
('36.8. oC',' 2 '),
('37.03 oC',' 5 '),
('37.1 oC',' 3610 '),
('37.16 oC',' 3 '),
('37.2 oCC000715799',' 1 '),
('37.27 oC',' 2 '),
('37.91 oC',' 1 '),
('38.9 oC',' 28 '),
('63.5 oC',' 1 '),
('71 oC',' 1 '),
(' 377 oC',' 8 '),
(' 36.5 oC',' 1 '),
(' 3.4 oC',' 3 '),
(' 3.7 oC',' 3 '),
('36.59 oC',' 1 '),
('36.67 oC',' 5 '),
('37.13 oC',' 1 '),
('37.18 oC',' 1 '),
('37.24 oC',' 1 '),
('39.7 oC',' 5 '),
('76 oC',' 2 '),
('80 oC',' 2 '),
(' 347 oC',' 1 '),
(' 352 oC',' 2 '),
(' 368 oC',' 64 '),
(' 3602 oC',' 1 '),
(' 3688 oC',' 1 '),
(' .36.4 oC',' 1 '),
(' .8 oC',' 1 '),
(' 3.2 oC',' 2 '),
('34.3 oC',' 5 '),
('34.9 oC',' 20 '),
('35 oC',' 124 '),
('35.81 oC',' 1 '),
('36.17 oC',' 2 '),
('36.23 oC',' 1 '),
('36.37 oC',' 2 '),
('36.38 oC',' 4 '),
('36.42 oC',' 1 '),
('36.76 oC',' 2 '),
('37..2 oC',' 1 '),
('37.00 oC',' 4 '),
('37.07 oC',' 6 '),
('37.12 oC',' 2 '),
('37.2 oC',' 3151 '),
('37.48 oC',' 2 '),
('39. oC',' 1 '),
('39.2 oC',' 9 '),
('39.9 oC',' 2 '),
(' 370 oC',' 1 '),
('30.1 oC',' 1 '),
('34.1 oC',' 2 '),
('34.8 oC',' 17 '),
('35.43 oC',' 1 '),
('36..8 oC',' 2 '),
('36.05 oC',' 1 '),
('36.21 oC',' 4 '),
('36.31 oC',' 2 '),
('36.41 oC',' 1 '),
('36.58 oC',' 8 '),
('36.8 oC',' 8134 '),
('36.81 oC',' 3 '),
('36.88 oC',' 2 '),
('36.89 oC',' 2 '),
('36.99 oC',' 4 '),
('37.01 oC',' 6 '),
('37.14 oC',' 3 '),
('37.33 oC',' 1 '),
('37.37 oC',' 6 '),
('37.44 oC',' 1 '),
('37.59 oC',' 2 '),
('38.5 oC',' 85 '),
('39.4 oC',' 9 '),
('78 oC',' 2 '),
('92 oC',' 1 '),
(' 361 oC',' 19 '),
(' 383 oC',' 1 '),
(' 391 oC',' 1 '),
(' 3642 oC',' 1 '),
(' 3699 oC',' 2 '),
(' 37.6 oC',' 1 '),
('35.59 oC',' 1 '),
('35.69 oC',' 1 '),
('35.90 oC',' 1 '),
('36..9 oC',' 1 '),
('36.08 oC',' 2 '),
('36.27 oC',' 1 '),
('36.365 oC',' 1 '),
('36.51 oC',' 1 '),
('36.78 oC',' 4 '),
('36.84 oC',' 1 '),
('36.85 oC',' 3 '),
('36.97 oC',' 2 '),
('37.29 oC',' 1 '),
('37.3 oC',' 2306 '),
('37.8 oC',' 730 '),
('38.08 oC',' 1 '),
('38.4 oC',' 113 '),
('38.49 oC',' 1 '),
('38.7 oC',' 53 '),
('39.3 oC',' 10 '),
('70 oC',' 2 '),
(' 357 oC',' 5 '),
(' 362 oC',' 49 '),
(' 396.8 oC',' 1 '),
(' 3700 oC',' 1 '),
(' 3752 oC',' 1 '),
(' .381 oC',' 1 '),
(' 0.37 oC',' 1 '),
(' 3.1 oC',' 1 '),
('14 oC',' 1 '),
('27 oC',' 1 '),
('34.2 oC',' 5 '),
('34.5 oC',' 22 '),
('35.9 oC',' 633 '),
('36.44 oC',' 2 '),
('36.57 oC',' 1 '),
('36.65 oC',' 1 '),
('36.66 oC',' 3 '),
('37.04 oC',' 7 '),
('65.9 oC',' 1 '),
('82 oC',' 2 '),
(' 118 oC',' 1 '),
(' 358 oC',' 6 '),
(' 381 oC',' 2 '),
(' 396.6 oC',' 1 '),
(' 3704 oC',' 1 '),
(' 3801 oC',' 1 '),
(' ',' 195340 '),
(' 362 oC',' 1 '),
(' .374 oC',' 1 '),
(' 3.6 oC',' 3 '),
('26.5 oC',' 1 '),
('35.0 oC',' 28 '),
('35.79 oC',' 1 '),
('36..7 oC',' 1 '),
('36.00 oC',' 2 '),
('36.18 oC',' 1 '),
('36.48 oC',' 4 '),
('36.49 oC',' 3 '),
('37.19 oC',' 2 '),
('37.46 oC',' 1 '),
('37.9 oC',' 465 '),
('38.12 oC',' 1 '),
('39 oC',' 25 '),
(' 351 oC',' 2 '),
(' 369. oC',' 1 '),
(' 389 oC',' 1 '),
(' 3736 oC',' 1 '),
(' NULL ',' 7 '),
('35.98 oC',' 1 '),
('36 oC',' 2948 '),
('36.28 oC',' 1 '),
('36.69 oC',' 1 '),
('36.72 oC',' 2 '),
('36.77 oC',' 4 '),
('36.98 oC',' 7 '),
('37.05 oC',' 3 '),
('37.06 oC',' 2 '),
('37.15 oC',' 3 '),
('37.25 oC',' 5 '),
('37.26 oC',' 3 '),
('37.39 oC',' 3 '),
('37.42 oC',' 1 '),
('37.68 oC',' 3 '),
('38.3 oC',' 160 '),
('38.6. oC',' 1 '),
(' 376 oC',' 18 '),
(' 3617 oC',' 1 '),
(' 3703 oC',' 1 '),
(' 3.8 oC',' 2 '),
(' 7.6 oC',' 1 '),
('30.6 oC',' 1 '),
('34 oC',' 3 '),
('34.7 oC',' 9 '),
('35.06 oC',' 1 '),
('35.7 oC',' 324 '),
('35.74 oC',' 1 '),
('36.01 oC',' 2 '),
('36.1 oC',' 1517 '),
('36.12 oC',' 1 '),
('36.4 oC',' 5001 '),
('36.6 oC',' 7044 '),
('36.79 oC',' 5 '),
('36.86 oC',' 1 '),
('36.90 oC',' 1 '),
('36.93 oC',' 1 '),
('37.30 oC',' 1 '),
('37.92 oC',' 1 '),
('38. oC',' 5 '),
('38.6 oC',' 65 '),
('38.8 oC',' 46 '),
('97 oC',' 1 '),
(' 354 oC',' 4 '),
(' 355 oC',' 5 '),
(' 365 oC',' 107 '),
(' 3654 oC',' 1 '),
('35.8 oC',' 495 '),
('36.09 oC',' 6 '),
('36.2 oC',' 2526 '),
('36.3. oC',' 1 '),
('36.47 oC',' 1 '),
('36.53 oC',' 2 '),
('36.9 oC',' 5449 '),
('37.0 oC',' 1209 '),
('37.1. oC',' 1 '),
('37.32 oC',' 2 '),
('37.38 oC',' 5 '),
('37.45 oC',' 1 '),
('37.5 oC',' 1477 '),
('37.6 oC',' 1101 '),
('37.80 oC',' 1 '),
('38.1 oC',' 215 '),
('40.2 oC',' 1 '),
('62 oC',' 1 '),
(' 366 oC',' 61 '),
(' 375 oC',' 28 '),
('16 oC',' 1 '),
('34.0 oC',' 1 '),
('35. oC',' 3 '),
('35.1 oC',' 61 '),
('35.23 oC',' 1 '),
('35.58 oC',' 2 '),
('36. oC',' 59 '),
('36.03 oC',' 1 '),
('36.16 oC',' 2 '),
('36.94 oC',' 2 '),
('37.08 oC',' 7 '),
('37.21 oC',' 1 '),
('37.47 oC',' 1 '),
('39.8 oC',' 3 '),
(' 346 oC',' 1 '),
(' 353 oC',' 2 '),
(' 369 oC',' 57 '),
(' 374 oC',' 28 '),
(' 3677 oC',' 1 '),
(' 37.4 oC',' 1 '),
('34.6 oC',' 15 '),
('35.3 oC',' 74 '),
('35.4 oC',' 120 '),
('35.6 oC',' 320 '),
('36.06 oC',' 1 '),
('36.07 oC',' 2 '),
('36.14 oC',' 1 '),
('36.19 oC',' 1 '),
('36.54 oC',' 1 '),
('36.71 oC',' 1 '),
('36.92 oC',' 1 '),
('37.50 oC',' 1 '),
('37.54 oC',' 1 '),
('37.7 oC',' 836 '),
('39.0 oC',' 8 '),
('39.6 oC',' 3 '),
('60 oC',' 1 '),
(' 127 oC',' 1 '),
(' 336.8 oC',' 1 '),
(' 1500 oC',' 1 '),
(' 36.4 oC',' 1 '),
('36.0 oC',' 829 '),
('36.3 oC',' 3192 '),
('36.56 oC',' 3 '),
('36.63 oC',' 2 '),
('36.7 oC',' 6348 '),
('36.73 oC',' 3 '),
('36.96 oC',' 4 '),
('37. oC',' 64 '),
('37.4 oC',' 1861 '),
('37.69 oC',' 1 '),
('38.01 oC',' 1 '),
('93 oC',' 1 '),
(' 351. oC',' 1 '),
(' 371 oC',' 24 '),
(' 372 oC',' 45 '),
(' 373 oC',' 30 '),
(' 3722 oC',' 1 '),
(' .3.8 oC',' 1 '),
('26.1 oC',' 1 '),
('35.97 oC',' 4 '),
('36.61 oC',' 3 '),
('37 oC',' 4890 '),
('37.02 oC',' 3 '),
('37.66 oC',' 1 '),
('38 oC',' 367 '),
('38.0 oC',' 72 '),
('38.2 oC',' 225 '),
('39.1 oC',' 22 '),
(' 359 oC',' 14 '),
(' 360 oC',' 3 '),
(' 363 oC',' 49 '),
(' 367 oC',' 112 '),
(' 378 oC',' 8 ')
SELECT *
FROM @Test;
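You already know what your data looks like and what you would like it to look like. What is stopping you from changing it?
Nothing is stopping me; this really is the data in the model table, and I need to get a valid INT value into the DW. – Simon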
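This boils down to implementing the rules you already have in T-SQL, broken down into many sub-problems, using 'CASE', 'LIKE', 'PATINDEX' and so on. What is "obvious" to you has to be painstakingly coded for the benefit of the computer. For more complex cases, you may want to do the cleanup client-side with regular expressions, because T-SQL's string manipulation is not very advanced.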
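To make that last comment concrete, here is a minimal sketch of how the rules listed in the question could be broken into small steps with 'CASE' and 'LIKE'. Several things in it are assumptions rather than anything the asker stated: the table variable @Sample (a handful of rows copied from the question), the variables @MinC and @MaxC and their 30 to 45 range for a plausible body temperature, and the idea of shifting the decimal point until the value has two digits before it. It also assumes SQL Server 2012 or later for TRY_CONVERT, and it returns a DECIMAL rather than an INT so that values such as 36.35 keep their decimals.

```sql
-- Hypothetical sample: a few rows copied from the question's data.
DECLARE @Sample TABLE ([Temperature] VARCHAR(500));
INSERT INTO @Sample ([Temperature])
VALUES ('36.5 oC'), (' 3635 oC'), (' .368 oC'), ('37.3. oC'),
       (' .37.7 oC'), (' 377 oC'), (' .3.8 oC'), ('88 oC'),
       ('36..6 oC'), ('37.2 oCC000715799'), (' '), (' NULL ');

-- Assumed plausible range for a human body temperature in Celsius.
DECLARE @MinC DECIMAL(5,2) = 30, @MaxC DECIMAL(5,2) = 45;

SELECT
    t.[Temperature],
    -- Anything outside the assumed range (e.g. 88) is treated as invalid.
    CASE WHEN f.Scaled BETWEEN @MinC AND @MaxC
         THEN CAST(f.Scaled AS DECIMAL(5,2))
    END AS TemperatureC
FROM @Sample AS t
-- Step 1: trim surrounding spaces.
CROSS APPLY (VALUES (LTRIM(RTRIM(t.[Temperature])))) AS a(Trimmed)
-- Step 2: keep the first token, dropping ' oC' and anything after it.
CROSS APPLY (VALUES (LEFT(a.Trimmed, CHARINDEX(' ', a.Trimmed + ' ') - 1))) AS b(Token)
-- Step 3: drop one leading dot ('.37.7' -> '37.7').
CROSS APPLY (VALUES (CASE WHEN b.Token LIKE '.%'
                          THEN STUFF(b.Token, 1, 1, '')
                          ELSE b.Token END)) AS c(NoLead)
-- Step 4: drop one trailing dot ('37.3.' -> '37.3').
CROSS APPLY (VALUES (CASE WHEN c.NoLead LIKE '%.'
                          THEN LEFT(c.NoLead, LEN(c.NoLead) - 1)
                          ELSE c.NoLead END)) AS d(Digits)
-- Step 5: convert only strings made of digits and at most one dot;
-- everything else ('36..6', 'NULL', blanks) becomes NULL.
CROSS APPLY (VALUES (CASE WHEN d.Digits LIKE '%[0-9]%'
                           AND d.Digits NOT LIKE '%[^0-9.]%'
                           AND d.Digits NOT LIKE '%.%.%'
                          THEN TRY_CONVERT(DECIMAL(9,3), d.Digits)
                     END)) AS e(Raw)
-- Step 6 (assumption): shift the decimal point so two digits precede it.
CROSS APPLY (VALUES (CASE WHEN e.Raw >= 1000 AND e.Raw < 10000 THEN e.Raw / 100
                          WHEN e.Raw >= 100  AND e.Raw < 1000  THEN e.Raw / 10
                          WHEN e.Raw >= 10   AND e.Raw < 100   THEN e.Raw
                          WHEN e.Raw >= 1    AND e.Raw < 10    THEN e.Raw * 10
                          WHEN e.Raw >= 0.1  AND e.Raw < 1     THEN e.Raw * 100
                     END)) AS f(Scaled);
```

Rows that come back with a NULL TemperatureC are the ones to exclude. Note that the scaling in step 6 is broader than the examples given in the question: it would also turn a value such as 396.8 into 39.68, so tighten that 'CASE' if such rows should be rejected instead. Wrap the result in a further CAST if a whole number really is required for the DW.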