2012-07-15 121 views
6

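Optimize query or suggest LINQ equivalent

I have a table with the columns date_trans, time_trans, and price. After the SELECT query, I want to add a new column, Count, that holds the number of consecutive rows with an equal price. Within each run of consecutive equal prices, every row except the last is removed from the final result. See the expected output: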

    date_trans  time_trans  price  Count
    2011-02-22  09:39:59    58.02  1
    2011-02-22  09:40:03    58.1   *ROW WILL BE REMOVED
    2011-02-22  09:40:07    58.1   *ROW WILL BE REMOVED
    2011-02-22  09:40:08    58.1   3
    2011-02-22  09:40:10    58.15  1
    2011-02-22  09:40:10    58.1   *ROW WILL BE REMOVED
    2011-02-22  09:40:14    58.1   2
    2011-02-22  09:40:24    58.15  1
    2011-02-22  09:40:24    58.18  *ROW WILL BE REMOVED
    2011-02-22  09:40:24    58.18  *ROW WILL BE REMOVED
    2011-02-22  09:40:24    58.18  3
    2011-02-22  09:40:24    58.15  1

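Please suggest a SQL query or a LINQ expression that selects this from the table.

Currently, the only way I can do this is to run the SELECT query and loop over all the selected rows, but when millions of rows are selected this takes hours.

My current code: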

    string query = @"SELECT date_trans, time_trans, price
                     FROM tbl_data
                     WHERE date_trans BETWEEN '2011-02-22' AND '2011-10-21'
                       AND time_trans BETWEEN '09:30:00' AND '16:00:00'";

    DataTable dt = oUtil.GetDataTable(query);

    DataColumn col = new DataColumn("Count", typeof(int));
    dt.Columns.Add(col);

    // Length of the current run of equal prices.
    int priceCount = 1;
    for (int count = 0; count < dt.Rows.Count; count++)
    {
        double price = Convert.ToDouble(dt.Rows[count]["price"]);
        // The last row has no successor; 0 is used as a sentinel price.
        double priceNext = (count == dt.Rows.Count - 1) ? 0 : Convert.ToDouble(dt.Rows[count + 1]["price"]);
        if (price == priceNext)
        {
            // Same price as the next row: extend the run and drop this row.
            priceCount++;
            dt.Rows.RemoveAt(count);
            count--;
        }
        else
        {
            // Last row of the run: store the run length and start a new run.
            dt.Rows[count]["Count"] = priceCount;
            priceCount = 1;
        }
    }

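I think you could use analytic functions in SQL for this. It's late and my brain can't fully work it out right now, but I'll come back after some rest to see whether you still need an answer. In the meantime, take a look at [this answer](http://stackoverflow.com/questions/7854854/getting-all-consecutive-rows-differing-by-certain-value) and how it uses analytic functions. – Ally 2012-07-22 03:35:38

Answers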

2

This is an interesting one. I think what you need is something like this:

SELECT MAX(date_trans), MAX(time_trans), MAX(price), COUNT(*) 
FROM 
    (SELECT *, ROW_NUMBER() OVER(PARTITION BY price ORDER BY date_trans, time_trans) - ROW_NUMBER() OVER(ORDER BY date_trans, time_trans) AS grp 
    FROM transactions) grps 
GROUP BY grp 

I found the solution here: http://www.sqlmag.com/article/sql-server/solution-to-the-t-sql-puzzle-grouping-consecutive-rows-with-a-common-element
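For readers unfamiliar with the trick, here is a minimal sketch of how the row-number difference marks runs. It assumes SQL Server 2008 or later. The `#demo` temp table and the `seq` column (standing in for the date_trans, time_trans ordering) are made up for illustration and are not part of the original schema.

```sql
-- Hypothetical demo data: seq stands in for the (date_trans, time_trans) order.
SELECT seq, price,
       ROW_NUMBER() OVER (PARTITION BY price ORDER BY seq) AS rn_in_price,
       ROW_NUMBER() OVER (ORDER BY seq)                    AS rn_overall
INTO #demo
FROM (VALUES (1, 58.02), (2, 58.10), (3, 58.10), (4, 58.10),
             (5, 58.15), (6, 58.10), (7, 58.10)) AS v (seq, price);

-- grp is the same difference the query above computes.
SELECT seq, price, rn_in_price, rn_overall,
       rn_in_price - rn_overall AS grp
FROM #demo
ORDER BY seq;
-- seq  price  rn_in_price  rn_overall  grp
-- 1    58.02  1            1            0
-- 2    58.10  1            2           -1
-- 3    58.10  2            3           -1
-- 4    58.10  3            4           -1
-- 5    58.15  1            5           -4
-- 6    58.10  4            6           -2
-- 7    58.10  5            7           -2
```

Within one price, the difference stays constant while the rows are adjacent and changes as soon as rows with another price come in between, so the two runs of 58.10 end up with grp values -1 and -2.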

UPDATE

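The grouping column also needs to include price, otherwise the groups may not be unique. Another point is that the date and time columns should be combined into a single datetime value, so that the maximum datetime is correct for a group that starts at the end of one day and ends at the start of the next. Here is the corrected query.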

SELECT MAX(CAST(date_trans AS DATETIME) + CAST(time_trans AS DATETIME)), MAX(price), COUNT(*)
FROM
    (SELECT *,
     -- run key: the row-number difference, suffixed with the price so that equal differences for different prices stay distinct
     CAST(ROW_NUMBER() OVER(PARTITION BY price ORDER BY date_trans, time_trans) - ROW_NUMBER() OVER(ORDER BY date_trans, time_trans) AS NVARCHAR(255)) + '-' + CAST(price AS NVARCHAR(255)) AS grp
    FROM transactions) grps
GROUP BY grp
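To see why the grouping key needs price, consider a small sketch with three rows whose prices alternate. The `#collide` temp table and the `seq` ordering column are hypothetical, chosen only to make the collision visible.

```sql
-- Hypothetical data: 58.10, then 58.15, then 58.10 again (three runs of one row each).
SELECT seq, price,
       ROW_NUMBER() OVER (PARTITION BY price ORDER BY seq)
     - ROW_NUMBER() OVER (ORDER BY seq) AS grp
INTO #collide
FROM (VALUES (1, 58.10), (2, 58.15), (3, 58.10)) AS v (seq, price);
-- grp values: seq 1 -> 0, seq 2 -> -1, seq 3 -> -1

-- Grouping by grp alone merges seq 2 (58.15) and seq 3 (58.10): 2 groups.
SELECT grp, COUNT(*) AS cnt
FROM #collide
GROUP BY grp;

-- Adding price to the key keeps the three runs apart: 3 groups.
SELECT price, grp, COUNT(*) AS cnt
FROM #collide
GROUP BY price, grp;
```

The difference is only guaranteed to identify a run within a single price, which is why two different prices can share the same grp value.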

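The query would probably perform better with the 'grp' column as a byte array or bigint rather than nvarchar. You also mentioned a 'volume' column that you may want to aggregate within each group.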
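One way to avoid the nvarchar key is to leave grp as the bigint that the ROW_NUMBER difference already produces and group by price and grp together. The sketch below does this on the sample rows from the question. It rests on assumptions: SQL Server 2008 or later (for the DATE and TIME types), a temp table `#transactions`, and a hypothetical unique `id` column that records arrival order. The sample contains rows with identical timestamps, and ROW_NUMBER does not order such ties deterministically, so some tie-breaker of this kind is needed for the two row numbers to agree.

```sql
-- Hypothetical setup: the question's sample rows plus an assumed unique id
-- column that breaks ties between identical timestamps.
CREATE TABLE #transactions (
    id         INT           NOT NULL PRIMARY KEY,
    date_trans DATE          NOT NULL,
    time_trans TIME(0)       NOT NULL,
    price      DECIMAL(10,2) NOT NULL
);

INSERT INTO #transactions (id, date_trans, time_trans, price) VALUES
 (1,  '2011-02-22', '09:39:59', 58.02),
 (2,  '2011-02-22', '09:40:03', 58.10),
 (3,  '2011-02-22', '09:40:07', 58.10),
 (4,  '2011-02-22', '09:40:08', 58.10),
 (5,  '2011-02-22', '09:40:10', 58.15),
 (6,  '2011-02-22', '09:40:10', 58.10),
 (7,  '2011-02-22', '09:40:14', 58.10),
 (8,  '2011-02-22', '09:40:24', 58.15),
 (9,  '2011-02-22', '09:40:24', 58.18),
 (10, '2011-02-22', '09:40:24', 58.18),
 (11, '2011-02-22', '09:40:24', 58.18),
 (12, '2011-02-22', '09:40:24', 58.15);

SELECT MAX(CAST(date_trans AS DATETIME) + CAST(time_trans AS DATETIME)) AS last_trans,
       price,
       COUNT(*) AS [Count],
       MAX(id)  AS last_id
       -- a volume column, if present, could be aggregated here with SUM(volume)
INTO #runs
FROM (SELECT id, date_trans, time_trans, price,
             ROW_NUMBER() OVER (PARTITION BY price ORDER BY date_trans, time_trans, id)
           - ROW_NUMBER() OVER (ORDER BY date_trans, time_trans, id) AS grp  -- bigint
      FROM #transactions) AS grps
GROUP BY price, grp;  -- composite key instead of a concatenated string

SELECT last_trans, price, [Count]
FROM #runs
ORDER BY last_id;
-- 7 rows; the Count column reads 1, 3, 1, 2, 1, 3, 1
```

Those counts match the expected output in the question. Whether the real table has a column that can play the role of `id` is an open question; without one, rows sharing a timestamp have no defined order and the result can vary between runs.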


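Thanks, Pavel. You're almost there. Please download the CSV, import it into a database, and check. Some records come out duplicated. Please help me by improving your query. https://docs.google.com/open?id=0B_fUxFgeU2-dc3hfR2JrR2ExQ2s The columns in the CSV are date_trans, time_trans, price, volume. – Mainuddin 2012-07-23 08:04:42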


Right. I've updated the answer. I hadn't noticed before, but you tagged your question 'mysql' when you meant 'mssql', right? – 2012-07-23 10:46:52


OK. Fixed. – 2012-07-23 14:27:43