2012-10-02 65 views
3

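SQL: Find the difference between rows where column values match

Sorry if my title does not accurately describe the task I am trying to carry out.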

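For a university project, I have been given the access logs of a website. I have discarded the columns I do not need and condensed the data down to this: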

╔══════════╦══════════════════════╦═════════════════╦═════════════╦════════════════╗ 
║ accessid ║ date_time_in_seconds ║ yg_requester_id ║ referent_id ║ referent_docid ║ 
╠══════════╬══════════════════════╬═════════════════╬═════════════╬════════════════╣ 
║  2449 ║  2009011621830 ║   32276 ║  12648 ║    1 ║ 
║  2776 ║  2009011622726 ║   76360 ║  11070 ║    1 ║ 
║  2804 ║  2009011622783 ║   32276 ║  13845 ║    1 ║ 
║  2894 ║  2009011623025 ║   32276 ║  7222 ║    1 ║ 
║  2895 ║  2009011623037 ║   32276 ║  1530 ║    1 ║ 
║  3000 ║  2009011623406 ║   32276 ║  3728 ║    1 ║ 
║  3019 ║  2009011623497 ║   520060 ║  10356 ║    1 ║ 
║  3245 ║  2009011625780 ║   300841 ║  4607 ║    1 ║ 
║  3274 ║  2009011628309 ║   532664 ║  14377 ║    1 ║ 
║  3275 ║  2009011628420 ║   532664 ║  9097 ║    1 ║ 
╚══════════╩══════════════════════╩═════════════════╩═════════════╩════════════════╝ 

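Originally the date and time stamp was stored as a separate column per unit (year, month, day, hour, minute, second). To make calculations easier, I merged them into a single column, date_time_in_seconds, with this format: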

[0000][00][00][00000] 
[YEAR][MONTH][DAY][Number of Seconds since 00:00] 
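As a rough illustration of the format above, here is a minimal Python sketch that decodes one of these values into an ordinary datetime. The function name `decode_timestamp` is hypothetical, and the sketch assumes a four-digit year, so that the first eight digits are always the date and whatever follows is the number of seconds since midnight.

```python
from datetime import datetime, timedelta


def decode_timestamp(value: int) -> datetime:
    """Decode [YEAR][MONTH][DAY][seconds since 00:00] into a datetime."""
    text = str(value)
    # Assumption: the first 8 digits are YYYYMMDD; the rest is seconds since midnight.
    day = datetime.strptime(text[:8], "%Y%m%d")
    return day + timedelta(seconds=int(text[8:]))


print(decode_timestamp(2009011621830))  # 2009-01-16 06:03:50
```

Once the values are real datetimes, subtracting two of them gives a correct elapsed time even when the two accesses fall on different days, which plain subtraction of the packed integers does not.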

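accessid is the ID of the table entry, yg_requester_id is the unique ID of the website visitor, referent_id is the ID of the website article they read, and referent_docid denotes the type of article, which is not needed for this task.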

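Basically, I want to find the time elapsed since the last different referent_id was accessed by the same yg_requester_id. For example, looking at this section of rows from the table above: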

╔══════════╦══════════════════════╦═════════════════╦═════════════╦════════════════╗ 
║ accessid ║ date_time_in_seconds ║ yg_requester_id ║ referent_id ║ referent_docid ║ 
╠══════════╬══════════════════════╬═════════════════╬═════════════╬════════════════╣ 
║  2449 ║  2009011621830 ║   32276 ║  12648 ║    1 ║ 
║  2776 ║  2009011622726 ║   76360 ║  11070 ║    1 ║ 
║  2804 ║  2009011622783 ║   32276 ║  13845 ║    1 ║ 
╚══════════╩══════════════════════╩═════════════════╩═════════════╩════════════════╝ 

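yg_requester_id 32276 accessed one article at 06:03:50 on 16 January 2009 and then accessed another article at 06:19:43 (both times taken from the seconds-after-midnight part of date_time_in_seconds), so it is safe to assume the user read the first article for approximately 15 minutes and 50 seconds.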

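What I want to find is the time difference between articles accessed by the same user. Consecutive articles read by one user may not have consecutive accessid values (although accessid is always increasing). I would also like to cap the reading time at roughly an hour, because the task is to filter out reading times below a variable number of minutes (e.g. 15).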
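To make the requirement concrete, the following Python sketch computes the gap between each access and the same requester's next access, using the sample rows above. The function name `reading_times` and its `min_seconds` / `max_seconds` parameters are hypothetical. The sketch assumes all rows fall on the same day (so only the seconds-since-midnight part is used), treats the one-hour cap as "drop longer gaps" rather than "clip them", and does not check that consecutive accesses are to different articles.

```python
from collections import defaultdict

# (accessid, seconds since midnight, yg_requester_id, referent_id), from the sample table.
ROWS = [
    (2449, 21830, 32276, 12648),
    (2776, 22726, 76360, 11070),
    (2804, 22783, 32276, 13845),
    (2894, 23025, 32276, 7222),
    (2895, 23037, 32276, 1530),
    (3000, 23406, 32276, 3728),
    (3019, 23497, 520060, 10356),
    (3245, 25780, 300841, 4607),
    (3274, 28309, 532664, 14377),
    (3275, 28420, 532664, 9097),
]


def reading_times(rows, min_seconds=0, max_seconds=3600):
    """Return (requester, article, seconds) for each access that has a later one."""
    by_user = defaultdict(list)
    # Sorting the tuples orders them by accessid, which always increases over time.
    for _accessid, seconds, requester, article in sorted(rows):
        by_user[requester].append((seconds, article))

    result = []
    for requester, visits in by_user.items():
        # Pair every visit with the same requester's next visit.
        for (start, article), (end, _next_article) in zip(visits, visits[1:]):
            elapsed = end - start
            if min_seconds <= elapsed <= max_seconds:
                result.append((requester, article, elapsed))
    return result


for row in reading_times(ROWS):
    print(row)
```

Calling `reading_times(ROWS, min_seconds=900)` keeps only reading times of at least 15 minutes. A requester's final access has no following access and therefore produces no row.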

Thanks in advance, and let me know if more information is needed.
+2

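Firstly, don't invent a new convention for storing time just for the sake of it. You could use Unix seconds since 1970, or you could store it as a proper datetime field, in addition to the original component fields. – RichardTheKiwi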

+0

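Thanks. I only merged the date/time measurements because I thought it would make the calculation easier. I still have the original data and will convert it to a proper datetime data type. – Edward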

+0

What do you do about the last visit? Would you compare it against GetDate() to get a duration? – RichardTheKiwi

Answers

2

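I would use ROW_NUMBER to partition the result set by yg_requester_id and order it by either accessid or the datetime (assuming you are going to change your date_time_in_seconds column into a regular datetime column, as suggested in the comments). Then I would join the result with itself on the requester and the adjacent record, and take the difference.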

Let me try to write the query, although I have no real data to test it against.

SELECT X1.yg_requester_id,
       DATEDIFF(SECOND, X1.NewDateTimeField, X2.NewDateTimeField) AS TimeDifferenceInSeconds,
       X2.referent_id AS NewArticle,
       X1.referent_id AS FormerArticle
FROM
(
    -- Position 1 is each requester's most recent access
    SELECT ROW_NUMBER() OVER (PARTITION BY yg_requester_id ORDER BY NewDateTimeField DESC) AS Position,
           NewDateTimeField, yg_requester_id, referent_id
    FROM YourTable
) X1
INNER JOIN
(
    SELECT ROW_NUMBER() OVER (PARTITION BY yg_requester_id ORDER BY NewDateTimeField DESC) AS Position,
           NewDateTimeField, yg_requester_id, referent_id
    FROM YourTable
) X2
    -- X2 is the access made immediately after X1 by the same requester
    ON X2.yg_requester_id = X1.yg_requester_id
   AND X2.Position = X1.Position - 1
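
For anyone who wants to try the ROW_NUMBER self-join without a SQL Server instance, here is a small sketch that runs the same idea in SQLite through Python's `sqlite3` module. It is an adaptation, not the query above verbatim: SQLite has no DATEDIFF, so the difference is computed with `strftime('%s', ...)`; the ranking is written once as a CTE and ordered ascending; and window functions require SQLite 3.25 or later. The four rows are the sample accesses with their timestamps converted by hand to datetime text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (accessid INTEGER, NewDateTimeField TEXT, "
    "yg_requester_id INTEGER, referent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (2449, "2009-01-16 06:03:50", 32276, 12648),
        (2776, "2009-01-16 06:18:46", 76360, 11070),
        (2804, "2009-01-16 06:19:43", 32276, 13845),
        (2894, "2009-01-16 06:23:45", 32276, 7222),
    ],
)

QUERY = """
WITH Ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY yg_requester_id ORDER BY NewDateTimeField
           ) AS Position,
           NewDateTimeField, yg_requester_id, referent_id
    FROM YourTable
)
SELECT X1.yg_requester_id,
       X1.referent_id AS FormerArticle,
       X2.referent_id AS NewArticle,
       CAST(strftime('%s', X2.NewDateTimeField) AS INTEGER)
         - CAST(strftime('%s', X1.NewDateTimeField) AS INTEGER) AS TimeDifferenceInSeconds
FROM Ranked X1
JOIN Ranked X2
  ON X2.yg_requester_id = X1.yg_requester_id
 AND X2.Position = X1.Position + 1   -- X2 is the access right after X1
ORDER BY X1.yg_requester_id, X1.Position
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Requester 76360 has a single access in this sample, so the inner join produces no row for it.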
+0

Thank you very much, this works perfectly. – Edward

+0

@Edward You're welcome. Could you mark the answer as accepted? Thanks a lot. – Jaime

0

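This query should retrieve the requester, the referent, and the time the requester spent on that referent, as a time difference in seconds: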

-- abc: every access (b) paired with each later access (a) by the same requester to a different article
-- The subtraction yields seconds only when both accesses fall on the same day
select abc.A_requester as requester_id,
       abc.B_refer as referent_id,
       abc.A_datetime - abc.B_datetime as time_difference
from
(select a.accessid as A_accessid, b.accessid as B_accessid,
        a.yg_requester_id as A_requester, a.date_time_in_seconds as A_datetime, a.referent_id as A_refer,
        b.yg_requester_id as B_requester, b.date_time_in_seconds as B_datetime, b.referent_id as B_refer
 from weblog a
 inner join weblog b
    on a.yg_requester_id = b.yg_requester_id
   and a.date_time_in_seconds > b.date_time_in_seconds
   and a.referent_id != b.referent_id) abc

inner join

-- xyz: for each earlier access, keep only the first later access to a different article
(select cte.B_accessid, min(cte.A_accessid) as C_accessid
 from
 (select a.accessid as A_accessid, b.accessid as B_accessid,
         a.yg_requester_id as A_requester, a.date_time_in_seconds as A_datetime, a.referent_id as A_refer,
         b.yg_requester_id as B_requester, b.date_time_in_seconds as B_datetime, b.referent_id as B_refer
  from weblog a
  inner join weblog b
     on a.yg_requester_id = b.yg_requester_id
    and a.date_time_in_seconds > b.date_time_in_seconds
    and a.referent_id != b.referent_id) cte
 group by cte.B_accessid) xyz

on xyz.B_accessid = abc.B_accessid and xyz.C_accessid = abc.A_accessid