2012-07-04 16 views
0

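Complex (for me) MySQL date matching

I have a website that is mirrored across two subdomains, so I have two separate sets of analytics data. I have the following tables: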

|------------------------------|
| table_a                      |
|------------------------------|
| url             | mod_date   |
|------------------------------|
| /foo/index.html | 2009-10-24 |
| /bar/index.php  | 2010-01-04 |
| /foo/bar.html   | 2009-01-04 |
|------------------------------|

|---------------------------------------|
| table_b                               |
|---------------------------------------|
| url             | views | access_date |
|---------------------------------------|
| /foo/index.html | 35000 | 2009-12-01  |
| /foo/index.html | 20000 | 2010-02-01  |
| /bar/index.php  | 35000 | 2010-01-01  |
| /bar/index.php  | 15000 | 2011-01-01  |
|---------------------------------------|

|---------------------------------------|
| table_c                               |
|---------------------------------------|
| url             | views | access_date |
|---------------------------------------|
| /foo/index.html | 35000 | 2009-10-01  |
| /foo/bar.html   | 10000 | 2011-05-01  |
| /bar/index.php  | 35000 | 2011-08-01  |
| /bar/index.php  | 15000 | 2012-04-01  |
|---------------------------------------|

I have the following query:

SELECT
    a.url
    ,DATE_FORMAT(a.mod_date, '%d/%m/%Y') AS 'mod_date'
    ,DATE_FORMAT(MIN(b.access_date), '%d/%m/%Y') AS 'first_date'
    ,DATE_FORMAT(MAX(b.access_date), '%d/%m/%Y') AS 'last_date'
    ,SUM(ifnull(b.views,0)) + SUM(ifnull(c.views,0)) AS 'page_views'
    ,DATEDIFF(MAX(b.access_date),MIN(b.access_date)) AS 'days'
    ,ROUND(SUM(b.views)/(DATEDIFF(MAX(b.access_date),MIN(b.access_date))/30.44)) AS 'b_mean_monthly_hits'
    ,ROUND(SUM(c.views)/(DATEDIFF(MAX(c.access_date),MIN(c.access_date))/30.44)) AS 'c_mean_monthly_hits'
FROM
    table_a a
     LEFT JOIN
    table_b b ON b.url = a.url
     LEFT JOIN
    table_c c ON c.url = a.url
GROUP BY a.url
HAVING ROUND(SUM(b.views)/(DATEDIFF(MAX(b.access_date),MIN(b.access_date))/30.44)) < 5
AND ROUND(SUM(c.views)/(DATEDIFF(MAX(c.access_date),MIN(c.access_date))/30.44)) < 5
;

The result I'm looking for is:

|----------------------------------------------------------------------------------------|
| results                                                                                |
|----------------------------------------------------------------------------------------|
| url             | mod_date   | first_date | last_date  | page_views | avg_monthly_hits |
|----------------------------------------------------------------------------------------|
| /foo/index.html | 2009-10-24 | 2009-10-01 | 2010-02-01 | 90000      | 22273            |
| /bar/index.php  | 2010-01-04 | 2010-01-01 | 2012-04-01 | 85000      | 3275             |
| /foo/bar.html   | 2009-01-04 | 2011-05-01 | 2011-06-01 | 10000      | 9819             |
|----------------------------------------------------------------------------------------|

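'avg_monthly_hits' is the sum of b.views and c.views (i.e. 'page_views') divided by the number of months between the oldest and the newest access_date across table_b and table_c. I don't know how to get months directly, so I take the number of days between those two dates and divide it by 30.44 (the average number of days in a month).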
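To make the arithmetic concrete, here is a minimal sketch of that formula in Python, applied to the /foo/index.html rows from the tables above. The function name and the handling of a zero-day span are assumptions made for illustration; they are not part of the original query.

```python
from datetime import date

DAYS_PER_MONTH = 30.44  # average month length used in the question


def avg_monthly_hits(page_views, first_date, last_date):
    """Views divided by the span between the two dates, expressed in months."""
    days = (last_date - first_date).days
    if days == 0:
        return None  # assumption: a single access date gives no span to average over
    return round(page_views / (days / DAYS_PER_MONTH))


# /foo/index.html: 35000 + 20000 (table_b) + 35000 (table_c) = 90000 views,
# oldest access_date 2009-10-01 (table_c), newest 2010-02-01 (table_b), 123 days apart.
hits = avg_monthly_hits(90000, date(2009, 10, 1), date(2010, 2, 1))
print(hits)  # 22273
```

This reproduces the 22273 shown for /foo/index.html in the desired result.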

I hope I've explained myself fully. :)

+1

Could you provide this on http://sqlfiddle.com?

+0

Please include the schema, and can you explain no_of_dates?

Answers

0

In the end I solved this with a nested query.

SELECT DISTINCT a.url 
, q.mod_date 
, IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date) AS 'min_date' 
, IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date) AS 'max_date' 
, (PERIOD_DIFF(DATE_FORMAT(IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date), '%Y%m'),DATE_FORMAT(IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date), '%Y%m')) + 1) AS 'months' 
, q.page_views 
, ROUND(q.page_views/((PERIOD_DIFF(DATE_FORMAT(IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date), '%Y%m'),DATE_FORMAT(IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date), '%Y%m'))) + 1)) AS 'avg_monthly_hits' 
FROM table_a a 
INNER JOIN 
    (SELECT 
      a.url, 
       a.mod_date AS 'mod_date',
       MIN(b.access_date) AS 'b_min_date',
       MAX(b.access_date) AS 'b_max_date',
       MIN(c.access_date) AS 'c_min_date',
       MAX(c.access_date) AS 'c_max_date',
       SUM(ifnull(b.views, 0)) + SUM(ifnull(c.views, 0)) AS 'page_views'
     FROM 
      table_a a 
       LEFT JOIN 
      table_b b ON a.url = b.url 
       LEFT JOIN 
      table_c c ON a.url = c.url 
     GROUP BY a.url 
) q 
ON a.url = q.url 
WHERE ROUND(q.page_views/((PERIOD_DIFF(DATE_FORMAT(IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date), '%Y%m'),DATE_FORMAT(IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date), '%Y%m'))) + 1)) < 5 
; 
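
Note that PERIOD_DIFF(...) + 1 counts calendar months inclusively, which is not the same as dividing the day span by 30.44 as described in the question. A small sketch in Python of the difference, using the /foo/index.html figures from the question (the helper name is made up for illustration):

```python
from datetime import date


def months_inclusive(first_date, last_date):
    """Calendar months touched by the range, like PERIOD_DIFF(last, first) + 1."""
    return ((last_date.year - first_date.year) * 12
            + (last_date.month - first_date.month) + 1)


first, last = date(2009, 10, 1), date(2010, 2, 1)
months = months_inclusive(first, last)  # Oct, Nov, Dec, Jan, Feb
by_months = round(90000 / months)
by_days = round(90000 / ((last - first).days / 30.44))
print(months, by_months, by_days)  # 5 18000 22273
```

So the nested query reports a lower monthly average than the days / 30.44 approach whenever the range starts or ends part-way through the months it counts.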
0

Try this query. It would be good to have some data to test it against.

select
    a.*,
    b.MinDate as `FirstDate`,
    b.MaxDate as `LastDate`,
    (ifnull(b.PSum,0) + ifnull(c.QSum,0)) as `TotalViews`,
    datediff(b.MaxDate,b.MinDate) as `Diff`,
    (ifnull(b.PSum,0)/(datediff(b.MaxDate,b.MinDate)/30.44)) as `BMonthlyHits`,
    (ifnull(c.QSum,0)/(datediff(c.MaxDate,c.MinDate)/30.44)) as `CMonthlyHits`
from table_a as a
left join (select url, min(access_date) as MinDate, max(access_date) as MaxDate, sum(views) as PSum from table_b group by url) as b on a.url = b.url
left join (select url, min(access_date) as MinDate, max(access_date) as MaxDate, sum(views) as QSum from table_c group by url) as c on a.url = c.url
group by a.url
HAVING BMonthlyHits < 5 and CMonthlyHits < 5
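
Aggregating table_b and table_c in their own subqueries before joining, as this query does, matters because joining both tables directly pairs every table_b row with every table_c row for the same url. A rough sketch of the effect, using Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable) and the /bar/index.php rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (url TEXT, mod_date TEXT);
    CREATE TABLE table_b (url TEXT, views INTEGER, access_date TEXT);
    CREATE TABLE table_c (url TEXT, views INTEGER, access_date TEXT);
    INSERT INTO table_a VALUES ('/bar/index.php', '2010-01-04');
    INSERT INTO table_b VALUES ('/bar/index.php', 35000, '2010-01-01'),
                               ('/bar/index.php', 15000, '2011-01-01');
    INSERT INTO table_c VALUES ('/bar/index.php', 35000, '2011-08-01'),
                               ('/bar/index.php', 15000, '2012-04-01');
""")

# Joining both detail tables directly yields 2 x 2 = 4 rows for this url,
# so each SUM counts every row of its table twice.
joined_total = conn.execute("""
    SELECT SUM(b.views) + SUM(c.views)
    FROM table_a a
    LEFT JOIN table_b b ON b.url = a.url
    LEFT JOIN table_c c ON c.url = a.url
    GROUP BY a.url
""").fetchone()[0]

# Aggregating each table first leaves one row per url on each side of the join.
pre_aggregated_total = conn.execute("""
    SELECT IFNULL(b.total, 0) + IFNULL(c.total, 0)
    FROM table_a a
    LEFT JOIN (SELECT url, SUM(views) AS total FROM table_b GROUP BY url) b
           ON b.url = a.url
    LEFT JOIN (SELECT url, SUM(views) AS total FROM table_c GROUP BY url) c
           ON c.url = a.url
""").fetchone()[0]

print(joined_total, pre_aggregated_total)  # 200000 100000
```

The direct join doubles the total, while the pre-aggregated version gives the plain sum of the four rows.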
0

If table_b and table_c have the same structure, just union them:

SELECT 
a.url, 
DATE_FORMAT(a.mod_date, '%d/%m/%Y') AS 'mod_date', 
DATE_FORMAT(MIN(u.access_date), '%d/%m/%Y') AS 'first_date', 
DATE_FORMAT(MAX(u.access_date), '%d/%m/%Y') AS 'last_date', 
SUM(u.views) AS 'page_views', 
DATEDIFF(MAX(u.access_date), MIN(u.access_date)) AS 'days', 
ROUND(SUM(u.views)/(DATEDIFF(MAX(u.access_date),MIN(u.access_date))/30.44)) AS 'avg_monthly_hits' 
FROM table_a AS a 
LEFT JOIN (
    (SELECT url, views, access_date FROM table_b)
    UNION ALL
    (SELECT url, views, access_date FROM table_c)
) AS u USING (url) 
GROUP BY a.url 
HAVING avg_monthly_hits < 5
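
As a rough check of the union approach against the sample data from the question, here is a sketch using Python's sqlite3 module in place of MySQL. That substitution is an assumption for the sake of a runnable example: julianday() differences stand in for DATEDIFF(), the date formatting and the HAVING filter are left out, and UNION ALL is used so that a row appearing identically in both tables would still be counted twice rather than collapsed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (url TEXT, mod_date TEXT);
    CREATE TABLE table_b (url TEXT, views INTEGER, access_date TEXT);
    CREATE TABLE table_c (url TEXT, views INTEGER, access_date TEXT);
    INSERT INTO table_a VALUES ('/foo/index.html', '2009-10-24'),
                               ('/bar/index.php', '2010-01-04'),
                               ('/foo/bar.html', '2009-01-04');
    INSERT INTO table_b VALUES ('/foo/index.html', 35000, '2009-12-01'),
                               ('/foo/index.html', 20000, '2010-02-01'),
                               ('/bar/index.php', 35000, '2010-01-01'),
                               ('/bar/index.php', 15000, '2011-01-01');
    INSERT INTO table_c VALUES ('/foo/index.html', 35000, '2009-10-01'),
                               ('/foo/bar.html', 10000, '2011-05-01'),
                               ('/bar/index.php', 35000, '2011-08-01'),
                               ('/bar/index.php', 15000, '2012-04-01');
""")

rows = conn.execute("""
    SELECT a.url,
           MIN(u.access_date) AS first_date,
           MAX(u.access_date) AS last_date,
           SUM(u.views) AS page_views,
           -- julianday() difference stands in for MySQL's DATEDIFF()
           ROUND(SUM(u.views) /
                 ((julianday(MAX(u.access_date)) - julianday(MIN(u.access_date))) / 30.44))
               AS avg_monthly_hits
    FROM table_a AS a
    LEFT JOIN (
        SELECT url, views, access_date FROM table_b
        UNION ALL
        SELECT url, views, access_date FROM table_c
    ) AS u ON u.url = a.url
    GROUP BY a.url
    ORDER BY a.url
""").fetchall()

results = {url: (first, last, views, hits) for url, first, last, views, hits in rows}
for url, values in results.items():
    print(url, values)
```

For /foo/index.html this gives 90000 views between 2009-10-01 and 2010-02-01 and an average of 22273, matching the desired result. A url with only one access date, such as /foo/bar.html here, has a zero-day span, so its average comes back as NULL.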