2014-07-24 14 views
0

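SQL query to find the start and end of ranges in a column with gaps

I have a table in Access containing a SKU column and its sales column. The sales column has gaps, i.e. runs of 3 or more blanks or zeroes. Zeroes should be treated as blanks and removed. A gap is defined as >= 3 consecutive blanks or zeroes. For each distinct SKU, I want to find the start and end of the continuous ranges within it, and the count (End - Start + 1).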

Small example:

SKU   SALES 
================== 
ABC  6504.00 
ABC  3304.23 
ABC  0 
ABC  0 
ABC   
ABC   
ABC  403.053 
ABC  3493.00 
ABC  3939.02 
DEF  4935.24 
DEF  3037.22 
DEF   
DEF   
DEF   
DEF  392.042 
DEF  0 
DEF  0 
DEF  3493.03 
DEF  8644.40 
DEF  643.035 
DEF  5333.22 

Result set:

SKU  RANGE  START  END COUNT 
ABC  1   1   2  2-1+1=2 
ABC  2   7   9  9-7+1=3 
DEF  1   10  11  11-10+1=2 
DEF  2   13  19  19-13+1=7 

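This result set should then be joined to the original table to exclude any SKU rows that belong to ranges with a count <= 13. Only the range with the maximum count within each SKU should be kept in the table/recordset.

I am using MS Access, but can anyone show this as an Access query as well as a SQL Server query?

EDIT =========================

@Kevin Hi,

I finally got the query working, and it gives me the correct ranges of sales weeks, but I now need some help joining back to the original staging table to pull only selected rows. JFYI, before running this query I updated all the Sales KPI columns to replace blanks with NULL.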

USE MASTER 
GO 

WITH Salesrows AS 
(
SELECT 
    [SCOUNTRY], 
    [SCHAR], 
    [DESCRIPTION], 
    [SALES VALUE WITH INNOVATION]=IIF([SALES VALUE WITH INNOVATION] IS NULL,0,[SALES VALUE WITH INNOVATION]), 
    CONVERT(INT, SUBSTRING([WEEK], 8, 2)) Wk, 
    CONVERT(INT, SUBSTRING([WEEK], 3, 4)) Yr, 
    [wkno], 
    ROW_NUMBER() OVER (PARTITION BY [SCOUNTRY],[SCHAR],[DESCRIPTION] ORDER BY [WEEK]) RN 
FROM STAGING 
WHERE ([Level] = 'Item') 
) 
,SalesRanges as 
(
SELECT *,   
    LAG([SALES VALUE WITH INNOVATION], 1) OVER (PARTITION BY [SCOUNTRY],[SCHAR],[DESCRIPTION] ORDER BY RN) L1, 
    LAG([SALES VALUE WITH INNOVATION], 2) OVER (PARTITION BY [SCOUNTRY],[SCHAR],[DESCRIPTION] ORDER BY RN) L2, 
    LEAD([SALES VALUE WITH INNOVATION], 1) OVER (PARTITION BY [SCOUNTRY],[SCHAR],[DESCRIPTION] ORDER BY RN) L5, 
    LEAD([SALES VALUE WITH INNOVATION], 2) OVER (PARTITION BY [SCOUNTRY],[SCHAR],[DESCRIPTION] ORDER BY RN) L6 
FROM SalesRows 
), 
Clearcontents as 
(
SELECT *, 
    (CASE WHEN ISNULL([SALES VALUE WITH INNOVATION], 0) = 0 AND ISNULL(L1,0) = 0 AND ISNULL(L2,0) = 0 THEN 1 ELSE 0 END) RemoveMe0, 
    (CASE WHEN ISNULL([SALES VALUE WITH INNOVATION], 0) = 0 AND ISNULL(L5,0) = 0 AND ISNULL(L6,0) = 0 THEN 1 ELSE 0 END) RemoveMe1, 
    (CASE WHEN ISNULL([SALES VALUE WITH INNOVATION], 0) = 0 AND ISNULL(L1,0) = 0 AND L2<>0 AND ISNULL(L5,0) = 0 AND L6<>0 THEN 1 ELSE 0 END) RemoveMe2 
FROM SalesRanges 
), 
CleanedData AS 
(
SELECT *, 
    ROW_NUMBER() OVER (PARTITION BY [SCOUNTRY],[SCHAR],[DESCRIPTION] ORDER BY yr, RN) NewRn 
FROM ClearContents 
WHERE RemoveMe0 != 1 and RemoveMe1 != 1 and RemoveMe2 != 1 
), 
WeekGaps as 
(
SELECT *, 
    (NewRn - Rn) Ref 
FROM CleanedData 
), 
CorrectWeekPeriods as 
(
SELECT 
    [SCOUNTRY], 
    [SCHAR], 
    [DESCRIPTION], 
    COUNT([wkno]) AS CNTWKS, 
    MIN([wkno]) AS MINWEEK, 
    MAX([wkno]) AS MAXWEEK, 
    REF 
FROM WeekGaps 
GROUP BY [SCOUNTRY],[SCHAR],[DESCRIPTION],[REF] 
) 
SELECT 
    C.[SCOUNTRY], 
    C.[SCHAR], 
    C.[DESCRIPTION], 
    CONVERT(INT, SUBSTRING(yw1.yrwk ,5,2)) WEEKS, 
    C.CNTWKS, 
    yw1.yrwk AS MINWEEK, 
    yw2.yrwk AS MAXWEEK 
FROM CorrectWeekPeriods AS C 
INNER JOIN yearweek AS yw1 ON C.MINWEEK = yw1.rn 
INNER JOIN yearweek AS yw2 ON C.MAXWEEK = yw2.rn 
--WHERE (C.CNTWKS > 13) AND (C.CNTWKS <= 52) 
--AND (C.CNTWKS=(SELECT MAX(A.CNTWKS) FROM CorrectWeekPeriods A WHERE C.[SCOUNTRY]=A.[SCOUNTRY] AND C.[SCHAR]=A.[SCHAR] AND C.[DESCRIPTION]=A.[DESCRIPTION])) 
--AND SUBSTRING(CAST(yw1.yrwk AS VARCHAR(6)),5,2) >= 1) 
--AND C.Description='0241004245' 
WHERE C.Description='0241004245' 
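
  1. Which fields of the CTE do I need to join to which fields in the staging table so that only the rows for these selected periods show up?

  2. I am sure this query can be optimized and made more concise. But how?

  3. Also, if I comment out the last WHERE clause of the final SELECT against CorrectWeekPeriods above and run the query multiple times, I get a different row count each time. I checked the execution plan and did not find anything wrong.

However, if I just uncomment this WHERE clause: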

WHERE (C.CNTWKS > 13) AND (C.CNTWKS <= 52) 
AND (C.CNTWKS=(SELECT MAX(A.CNTWKS) FROM CorrectWeekPeriods A WHERE C.[SCOUNTRY]=A. 

or this one:

WHERE C.Description='0241004245' 

I get the correct min & max week sales ranges.

  4. Also, if I uncomment

    WHERE C.Description='0241004245'

the execution plan shows me an error:

/* 
Missing Index Details from SQL_Correct Gaps.sql - ABC.master (ALPHA\SIFAR (52)) 
The Query Processor estimates that implementing the following index could improve the query cost by 97.7228%. 
*/ 

/* 
USE [master] 
GO 
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] 
ON [dbo].[staging] ([Level],[Description]) 
INCLUDE ([Week],[Sales Value with Innovation],[sCountry],[sChar],[wkno]) 
GO 
*/ 

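If I keep this last WHERE clause commented out, I do not get this error. BTW, I have already created the above index, so I do not know why it asks me to create the same index again. Why does this happen?

Also, the last few commented-out lines of code are rules I was trying to implement but could not write correct code for. Here are the rules:

  1. If a SKU has 2 or more sales week ranges, then pick the largest one (& better if it starts from week 1 of 2011).
  2. Exclude any ranges > 52 so that they are brought down to <= 52.
  3. If all of a SKU's sales week ranges are > 13 & <= 52, then keep only the largest one (& better if it starts from week 1 of 2011).
  4. Exclude any ranges <= 13.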
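
One way to read rules 1, 3, and 4 is as a filter followed by a per-SKU ranking. The sketch below is only an illustration under assumptions: `@Ranges` is a hypothetical table variable standing in for the `CorrectWeekPeriods` output, its rows are invented sample values, week numbers are assumed to be a running number like `wkno`, and rule 2 is interpreted as simply dropping ranges longer than 52 weeks rather than trimming them.

```sql
-- Hypothetical stand-in for the per-SKU ranges (invented sample values).
DECLARE @Ranges TABLE (SKU VARCHAR(10), MinWeek INT, MaxWeek INT, CntWks INT);

INSERT INTO @Ranges (SKU, MinWeek, MaxWeek, CntWks)
VALUES
('ABC',  1, 20, 20),
('ABC', 30, 70, 41),
('ABC', 80, 85,  6),
('DEF',  1, 60, 60),
('DEF', 70, 90, 21);

-- Keep ranges longer than 13 and at most 52 weeks, then rank them per SKU:
-- longest first, and on a tie prefer the range that starts earliest.
WITH Ranked AS
(
    SELECT
        SKU, MinWeek, MaxWeek, CntWks,
        ROW_NUMBER() OVER (PARTITION BY SKU
                           ORDER BY CntWks DESC, MinWeek ASC) AS Pick
    FROM @Ranges
    WHERE CntWks > 13 AND CntWks <= 52
)
SELECT SKU, MinWeek, MaxWeek, CntWks
FROM Ranked
WHERE Pick = 1;
```

With these sample rows the query keeps the 41-week range for ABC and the 21-week range for DEF. Ranking with `ROW_NUMBER` instead of comparing against a correlated `MAX` subquery means the ranges are only read once and exactly one row per SKU survives even when two ranges have the same length.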

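I hope someone can point me in the right direction (especially on my main point 1, joining back to the staging table to extract the appropriate SKU sales week ranges).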
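
A join back to the row-level table can be written as a range join on the week number. This is a minimal sketch under assumptions: `@Staging` and `@Picked` are hypothetical table variables with invented rows, and a single `SKU` column stands in for the `[SCOUNTRY]`, `[SCHAR]`, `[DESCRIPTION]` key used in the query above, where all three columns would have to appear in the join condition.

```sql
-- Hypothetical row-level data and the ranges chosen for each SKU.
DECLARE @Staging TABLE (SKU VARCHAR(10), WkNo INT, Sales FLOAT);
DECLARE @Picked  TABLE (SKU VARCHAR(10), MinWeek INT, MaxWeek INT);

INSERT INTO @Staging (SKU, WkNo, Sales)
VALUES ('ABC', 1, 10.0), ('ABC', 2, 12.5), ('ABC', 3, 0), ('ABC', 7, 8.0), ('DEF', 1, 5.0);

INSERT INTO @Picked (SKU, MinWeek, MaxWeek)
VALUES ('ABC', 1, 2);

-- Keep only staging rows whose week falls inside the range picked for that SKU.
SELECT s.SKU, s.WkNo, s.Sales
FROM @Staging AS s
INNER JOIN @Picked AS p
    ON  p.SKU = s.SKU
    AND s.WkNo BETWEEN p.MinWeek AND p.MaxWeek
ORDER BY s.SKU, s.WkNo;
```

Here only the two ABC rows for weeks 1 and 2 are returned. Since `MINWEEK` and `MAXWEEK` in `CorrectWeekPeriods` are computed from `[wkno]`, the same column is the natural one to compare against on the staging side.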

EDIT... I just commented out the last WHERE clause again:

WHERE (C.CNTWKS > 13) AND (C.CNTWKS <= 52) 
AND (C.CNTWKS=(SELECT MAX(A.CNTWKS) FROM CorrectWeekPeriods A WHERE C.[SCOUNTRY]=A.[SCOUNTRY] AND C.[SCHAR]=A.[SCHAR] AND C.[DESCRIPTION]=A.[DESCRIPTION])) 
AND SUBSTRING(CAST(yw1.yrwk AS VARCHAR(6)),5,2) >= 1 

and looked at the execution plan. It shows warnings on the SORT & HASH operators. The warning message is:

Operator used tempdb to spill data during execution with spill level 1 

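And whenever I execute the query, I get a different row count. The query also takes ~1 minute to execute. I think it is somehow related to the join to the yearweek table, but I do not know how to fix this.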
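
One possible cause of changing row counts, not confirmed by anything in this post, is a non-unique sort key: when several rows share the same partition columns and `[WEEK]`, `ROW_NUMBER` may number the tied rows in a different order on each run, and SQL Server can re-evaluate a CTE each time it is referenced. The sketch below is a diagnostic under assumptions: `@Staging` is a hypothetical table variable with invented rows, reduced to the columns used for partitioning and ordering.

```sql
-- Hypothetical sample: the third row repeats the key of the second.
DECLARE @Staging TABLE
(
    SCountry    VARCHAR(50),
    SChar       VARCHAR(50),
    Description VARCHAR(100),
    [Week]      VARCHAR(9)
);

INSERT INTO @Staging (SCountry, SChar, Description, [Week])
VALUES
('UK', 'X', '0241004245', 'W 2011 01'),
('UK', 'X', '0241004245', 'W 2011 02'),
('UK', 'X', '0241004245', 'W 2011 02'),
('UK', 'X', '0241004246', 'W 2011 01');

-- Any row returned here is a key on which ROW_NUMBER cannot order deterministically.
SELECT SCountry, SChar, Description, [Week], COUNT(*) AS RowsPerKey
FROM @Staging
GROUP BY SCountry, SChar, Description, [Week]
HAVING COUNT(*) > 1;
```

If the real table returns no rows for this check, the numbering is deterministic and the cause lies elsewhere, for example in duplicate `rn` values in the `yearweek` table.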

Any help would be most appreciated.

Hi @kevin cook,

Here is the table definition:

USE [master] 
GO 

/****** Object: Table [dbo].[staging] Script Date: 8/6/2014 11:27:29 PM ******/ 
DROP TABLE [dbo].[staging] 
GO 

/****** Object: Table [dbo].[staging] Script Date: 8/6/2014 11:27:29 PM ******/ 
SET ANSI_NULLS ON 
GO 

SET QUOTED_IDENTIFIER ON 
GO 

SET ANSI_PADDING ON 
GO 

CREATE TABLE [dbo].[staging](
    [Level] [varchar](5) NULL, 
    [Week] [varchar](9) NULL, 
    [Category] [varchar](50) NULL, 
    [Manufacturer] [varchar](50) NULL, 
    [Brand] [varchar](50) NULL, 
    [Description] [varchar](100) NULL, 
    [EAN] [varchar](100) NULL, 
    [Sales Value with Innovation] [float] NULL, 
    [Sales Units with Innovation] [float] NULL, 
    [Price Per Item] [float] NULL, 
    [Importance Value w Innovation] [float] NULL, 
    [Importance Units w Innovation] [float] NULL, 
    [Numeric Distribution] [float] NULL, 
    [Weighted Distribution] [float] NULL, 
    [Average Number of Item] [float] NULL, 
    [Value] [float] NULL, 
    [Volume] [float] NULL, 
    [Units] [float] NULL, 
    [Sales Value New Manufacturer] [float] NULL, 
    [Sales Value New Brand] [float] NULL, 
    [Sales Value New Line Extension] [float] NULL, 
    [Sales Value New Packaging] [float] NULL, 
    [Sales Value New Size] [float] NULL, 
    [Sales Value New Product Form] [float] NULL, 
    [Sales Value New Style Type] [float] NULL, 
    [Sales Value New Flavour Fragr] [float] NULL, 
    [Sales Value New Claim] [float] NULL, 
    [Sales Units New Manufacturer] [float] NULL, 
    [Sales Units New Brand] [float] NULL, 
    [Sales Units New Line Extension] [float] NULL, 
    [Sales Units New Packaging] [float] NULL, 
    [Sales Units New Size] [float] NULL, 
    [Sales Units New Product Form] [float] NULL, 
    [Sales Units New Style Type] [float] NULL, 
    [Sales Units New Flavour Fragr] [float] NULL, 
    [Sales Units New Claim] [float] NULL, 
    [filename] [nvarchar](260) NULL, 
    [importdate] [datetime] NULL CONSTRAINT [DF_staging_importdate] DEFAULT (getdate()), 
    [sCountry] [varchar](50) NULL, 
    [sChar] [varchar](50) NULL, 
    [yr] [int] NULL, 
    [wk] [int] NULL, 
    [wkno] [int] NULL 
) ON [PRIMARY] 

GO 

SET ANSI_PADDING OFF 
GO 
+2

What data element determines the row order? – VBlades

+0

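There is a Week column holding the year and week number in the form 'W 2011 01', and each year runs for 52 weeks (2011, 2012, 2013). So 'W 2011 01' to 'W 2011 52', 'W 2012 01' to 'W 2012 52', 'W 2013 01' to 'W 2013 52'. Can this be used for row ordering? – sifar786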

+0

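I got about 70% of the way here, but it is getting a bit late for me. I forgot about your gap rule while writing this, so there is no logic for it. You will have to figure that part out. Hopefully this gets you going in the right direction. Sorry, it is SQL Server code only. http://pastebin.com/ejkfztx5 – Chris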

Answer

1

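This works in SQL Server 2012. To change it for 2008+, you would have to do a couple of self-joins of SaleRows in the SaleRanges CTE to take the place of the LAG function. Here is some sample data: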

DECLARE @SalesTape TABLE 
( 
    SKU VARCHAR(10), 
    SALES DECIMAL(19,3), 
    YEARWEEK VARCHAR(10) 
) 

INSERT INTO @SalesTape 
VALUES 
('ABC', 6504.00, 'W 2011 01'), 
('ABC', 3304.23, 'W 2011 02'), 
('ABC', 0, 'W 2011 03'), 
('ABC', 0, 'W 2011 04'), 
('ABC', null, 'W 2011 05'), 
('ABC', null, 'W 2011 06'), 
('ABC', 403.053, 'W 2011 07'), 
('ABC', 3493.00, 'W 2011 08'), 
('ABC', 3939.02, 'W 2011 09'), 
('DEF', 4935.24, 'W 2011 10'), 
('DEF', 3037.22, 'W 2011 11'), 
('DEF', null, 'W 2011 12'), 
('DEF', null, 'W 2011 13'), 
('DEF', null, 'W 2011 14'), 
('DEF', 392.042, 'W 2011 15'), 
('DEF', 0, 'W 2011 16'), 
('DEF', 0, 'W 2011 17'), 
('DEF', 3493.03, 'W 2011 18'), 
('DEF', 8644.40, 'W 2011 19'), 
('DEF', 643.035, 'W 2011 20'), 
('DEF', 5333.22, 'W 2011 21'); 

My first CTE just sets up some row numbers and sets the sales to 0 if it is NULL.

;WITH SaleRows AS 
(
    SELECT 
     SKU, 
     ISNULL(SALES, 0.0) SALES, 
     CONVERT(INT, SUBSTRING(YEARWEEK, 8, 2)) Wk, 
     CONVERT(INT, SUBSTRING(YEARWEEK, 3, 4)) Yr, 
     ROW_NUMBER() OVER (ORDER BY YEARWEEK) RN 
    FROM @SalesTape 
), 

This second CTE builds on the first: it looks at the previous 2 rows and puts their sales values into columns of the CTE.

SaleRanges AS 
(
    SELECT 
     SaleRows.SKU, 
     SaleRows.SALES, 
     SaleRows.Wk, 
     SaleRows.Yr, 
     SaleRows.RN, 
     LAG(SALES, 2) OVER (ORDER BY RN) L2, 
     LAG(SALES, 1) OVER (ORDER BY RN) L1 
    FROM SaleRows 
), 
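
For versions without `LAG`, the same two columns can be produced with self-joins on the row number. This is a sketch under assumptions: `@SaleRows` is a hypothetical table variable standing in for the `SaleRows` CTE, holding a few of the sample rows, and `RN` is assumed to be unique and gap-free, which `ROW_NUMBER` guarantees.

```sql
-- Hypothetical stand-in for the SaleRows CTE (first five sample rows, NULL already mapped to 0).
DECLARE @SaleRows TABLE (SKU VARCHAR(10), SALES DECIMAL(19,3), RN INT);

INSERT INTO @SaleRows (SKU, SALES, RN)
VALUES ('ABC', 6504.00, 1), ('ABC', 3304.23, 2), ('ABC', 0, 3), ('ABC', 0, 4), ('ABC', 0, 5);

SELECT
    cur.SKU,
    cur.SALES,
    cur.RN,
    p2.SALES AS L2,   -- same value as LAG(SALES, 2) OVER (ORDER BY RN)
    p1.SALES AS L1    -- same value as LAG(SALES, 1) OVER (ORDER BY RN)
FROM @SaleRows AS cur
LEFT JOIN @SaleRows AS p1 ON p1.RN = cur.RN - 1   -- previous row, NULL for the first row
LEFT JOIN @SaleRows AS p2 ON p2.RN = cur.RN - 2   -- row before that, NULL for the first two rows
ORDER BY cur.RN;
```

The `LEFT JOIN` keeps the first rows and returns NULL for the missing predecessors, which is what `LAG` does by default. If the row numbers are partitioned per SKU, the partition columns would also need to appear in both join conditions.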

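Now, if my row and the previous 2 rows are all 0.0, I flag the row for removal (this creates the breaks between periods). Then we generate new row numbers over the cleaned data for later use.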

ClearContent AS 
(
    SELECT *, 
     CASE WHEN L1 = 0.0 AND L2 = 0.0 AND ISNULL(SALES, 0.00) = 0.0 THEN 1 ELSE 0 END RemoveMe 
    FROM SaleRanges 
), 
CleanedData AS 
(
    SELECT 
     *, 
     ROW_NUMBER() OVER (PARTITION BY SKU ORDER BY RN) NewRn 
    FROM ClearContent 
    WHERE RemoveMe != 1 
) 

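With the invalid rows removed, we subtract the new row number from the week number. The difference stays constant within a period and changes after each removed gap, so it serves as a logical period reference.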

SELECT 
    SKU, 
    SALES, 
    Wk, 
    Yr, 
    (WK - NewRn) Ref 
FROM CleanedData 
WHERE SALES != 0.0 

Here is the output:

SKU SALES Wk Yr Ref 
ABC 6504.000 1 2011 0 
ABC 3304.230 2 2011 0 
ABC 403.053 7 2011 2 
ABC 3493.000 8 2011 2 
ABC 3939.020 9 2011 2 
DEF 4935.240 10 2011 9 
DEF 3037.220 11 2011 9 
DEF 392.042 15 2011 10 
DEF 3493.030 18 2011 10 
DEF 8644.400 19 2011 10 
DEF 643.035 20 2011 10 
DEF 5333.220 21 2011 10 

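Ref gives you the groups, so you just need to grab the min and max Wk for each Ref to find the first and last records. You could clean this up and simplify it, but I wanted to show the steps. Hope this helps.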
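
The final grouping step can be sketched as follows. The assumptions: `@Periods` is a hypothetical table variable loaded with the SKU, Wk, and Ref values from the output above, and the column names `RangeNo`, `StartWk`, `EndWk`, and `WeekCount` are illustrative.

```sql
-- Hypothetical table variable holding the output rows shown above (SKU, Wk, Ref).
DECLARE @Periods TABLE (SKU VARCHAR(10), Wk INT, Ref INT);

INSERT INTO @Periods (SKU, Wk, Ref)
VALUES
('ABC', 1, 0), ('ABC', 2, 0),
('ABC', 7, 2), ('ABC', 8, 2), ('ABC', 9, 2),
('DEF', 10, 9), ('DEF', 11, 9),
('DEF', 15, 10), ('DEF', 18, 10), ('DEF', 19, 10), ('DEF', 20, 10), ('DEF', 21, 10);

-- One row per (SKU, Ref) group: first week, last week, and the span in weeks.
SELECT
    SKU,
    ROW_NUMBER() OVER (PARTITION BY SKU ORDER BY MIN(Wk)) AS RangeNo,
    MIN(Wk) AS StartWk,
    MAX(Wk) AS EndWk,
    MAX(Wk) - MIN(Wk) + 1 AS WeekCount
FROM @Periods
GROUP BY SKU, Ref
ORDER BY SKU, StartWk;
```

For the sample data this returns ABC 1-2 (2 weeks), ABC 7-9 (3 weeks), DEF 10-11 (2 weeks), and DEF 15-21 (7 weeks). `WeekCount` is computed as End - Start + 1 rather than `COUNT(*)`, so the two zero weeks inside the last DEF range, which are too short to count as a gap, are still included in its length.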

+0

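Hi @Kevin, this is a really awesome answer! Thanks a bunch. You are an excellent **teacher**. :-) How do I remove the rows in ranges <= 13 and keep only the largest sales week range per SKU that is > 13 and <= 52? The weekly data for each SKU runs from week 1 to 52 in each year. – sifar786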

+0

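Hi @Kevin Cook, somehow I am not getting the correct results when I run the above code on my table created from CSV. I have attached 2 sample CSV files for you to look at. [link](http://1drv.ms/1o2oX65) You can import them together, or just one of them, into a temp (staging) table. The **Description** column is the **SKU** and **Sales Value with Innovation** is the **Sales** column. The **YearWeek** column is the **Week** column. Please filter on `Level = 'Item'`. The **Week** column runs from 1 to 156 for each **Description** (SKU). Can you help me offline? I am sifar786 at gmail dot com. – sifar786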

+0

Hi @kevin, can you help? – sifar786