2013-06-23 45 views
0

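Converting a MySQL LIKE query to a full-text query

OK, so I have a nice little query that returns scored results. The query is currently LIKE-based, and I would like to convert it to a full-text query, as everyone keeps telling me to. I would like to get the same result order, even if the scores are not identical. The only way I have been able to get anything close is by unrolling my cross joins...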

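  • I want to be able to set the score for specific word combinations
  • I want to be able to set a weight based on where the term is found
  • I don't want to search on the power set of the words in the search. That is, if the user enters "a railway employee", I never want to search for "a employee". I only want to search for groupings of consecutive terms from the query.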
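
The "consecutive groupings only" rule in the last bullet can be sketched in SQL. This is a minimal illustration, not part of the original query: the `words` derived table and its `pos` column are hypothetical names, and the `WITH` clause assumes MySQL 8.0 or later. Each phrase is identified by a start position and an end position, so a three-word input yields six phrases rather than the seven non-empty subsets of its power set.

```sql
-- Sketch only: `words` and `pos` are hypothetical names, and the WITH
-- clause requires MySQL 8.0+.
WITH words AS (
    SELECT 1 AS pos, 'a' AS word UNION ALL
    SELECT 2, 'railway' UNION ALL
    SELECT 3, 'employee'
)
SELECT
    -- Rebuild the phrase from every word between the start and end positions.
    GROUP_CONCAT(w.word ORDER BY w.pos SEPARATOR ' ') AS term,
    e.pos - s.pos + 1 AS word_count
FROM words s
INNER JOIN words e ON e.pos >= s.pos                    -- every (start, end) pair
INNER JOIN words w ON w.pos BETWEEN s.pos AND e.pos     -- the words in between
GROUP BY s.pos, e.pos
ORDER BY word_count DESC, s.pos;
```

Because a phrase is always a contiguous range of positions, "a employee" can never be produced. Mapping `word_count` to a score (and zeroing out noise words such as "a") is left open here.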

How can I make my original query full-text based while still keeping it relatively small and organized?

You can see both queries on SQLFiddle.

Original query - nice and small, with scores and search terms all in one place

SELECT 
    sum(score * multiplier) score, 
    a.id, 
    a.title 
FROM 
(
    SELECT 3 score, 'a railway employee' term UNION ALL 
    SELECT 2 score, 'railway employee' term UNION ALL 
    SELECT 2 score, 'a railway' term UNION ALL 
    SELECT 1 score, 'employee' term UNION ALL 
    SELECT 1 score, 'railway' term UNION ALL 
    SELECT 0 score, 'a' term 
) terms 
CROSS JOIN 
(
    SELECT 'T' TYPE, 1 multiplier 
    UNION ALL SELECT 'S', 1.1 
    UNION ALL SELECT 'C', 1.5 
) x 
INNER JOIN 
(
    SELECT id, 'T' TYPE, title SEARCH FROM articles 
    UNION ALL 
    SELECT id, 'S' TYPE, summary SEARCH FROM articles WHERE summary <> '' 
    UNION ALL 
    SELECT artId, 'C' TYPE, content SEARCH FROM articleSections 
) s ON s.TYPE = x.TYPE AND SEARCH LIKE concat('%', terms.term, '%') 
INNER JOIN articles a ON a.id = s.id 
WHERE terms.score > 0 
GROUP BY a.id, a.title 
ORDER BY score DESC, title; 

Full-text - messy, large, with scores and search terms all over the place

SELECT 
    sum(score * multiplier) score, 
    id, 
    title 
FROM 
(
SELECT 
    3 score, 
    1 multiplier, 
    'T' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(title) AGAINST ('"a railway employee"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    2 score, 
    1 multiplier, 
    'T' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(title) AGAINST ('"railway employee"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    2 score, 
    1 multiplier, 
    'T' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(title) AGAINST ('"a railway"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    1 score, 
    1 multiplier, 
    'T' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(title) AGAINST ('railway' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    1 score, 
    1 multiplier, 
    'T' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(title) AGAINST ('employee' IN BOOLEAN MODE) 
UNION ALL 


SELECT 
    3 score, 
    1.1 multiplier, 
    'S' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(summary) AGAINST ('"a railway employee"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    2 score, 
    1.1 multiplier, 
    'S' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(summary) AGAINST ('"railway employee"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    2 score, 
    1.1 multiplier, 
    'S' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(summary) AGAINST ('"a railway"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    1 score, 
    1.1 multiplier, 
    'S' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(summary) AGAINST ('railway' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    1 score, 
    1.1 multiplier, 
    'S' AS loc, 
    id, 
    title 
FROM articles 
WHERE MATCH(summary) AGAINST ('employee' IN BOOLEAN MODE) 
UNION ALL 


SELECT 
    3 score, 
    1.5 multiplier, 
    'C' AS loc, 
    id, 
    title 
FROM articleSections 
INNER JOIN articles a ON a.id = artId 
WHERE MATCH(content) AGAINST ('"a railway employee"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    2 score, 
    1.5 multiplier, 
    'C' AS loc, 
    id, 
    title 
FROM articleSections 
INNER JOIN articles a ON a.id = artId 
WHERE MATCH(content) AGAINST ('"railway employee"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    2 score, 
    1.5 multiplier, 
    'C' AS loc, 
    id, 
    title 
FROM articleSections 
INNER JOIN articles a ON a.id = artId 
WHERE MATCH(content) AGAINST ('"a railway"' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    1 score, 
    1.5 multiplier, 
    'C' AS loc, 
    id, 
    title 
FROM articleSections 
INNER JOIN articles a ON a.id = artId 
WHERE MATCH(content) AGAINST ('railway' IN BOOLEAN MODE) 
UNION ALL 
SELECT 
    1 score, 
    1.5 multiplier, 
    'C' AS loc, 
    id, 
    title 
FROM articleSections 
INNER JOIN articles a ON a.id = artId 
WHERE MATCH(content) AGAINST ('employee' IN BOOLEAN MODE) 

) t 
WHERE score > 0 
GROUP BY id, title 
ORDER BY score DESC, title; 
+0

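You have the wrong set of requirements. The "requirements" you list are artificial and limit the kinds of solutions you could use. Requirements should constrain solutions, not specify them. Please reconsider what you actually want from the search and edit the question. –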

+0

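@LieRyan - I want to be able to determine how and why a title shows up in the results... To do that, I want to control how things are scored and what the scores are... If I didn't care what results I got back, I would just put a WHERE MATCH at the end of a simple SELECT and be done with it. – Justin808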

+2

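@Justin808 . . . Given your scoring needs, you may not want to use full-text search at all. Alternatively, you may want to use full-text search to find the rows that contain the keywords, and then use `like` and `join` to accumulate the score. –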

Answers

0

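This is too long for a comment.

Clearly, you have very specific scoring needs that fit neither the natural language mode nor the boolean mode of full-text search. I wonder whether MySQL has some hidden mechanism that would give you the list of keyword matches from a search, which you could then use for scoring. I don't know of one.

If you have a large corpus and relatively rare words (meaning the words you are looking for occur in relatively few documents), then you can use boolean mode to reduce the search space. Such a query would look like this: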
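
For comparison, this is roughly what the built-in scoring looks like. It is a minimal sketch that assumes a FULLTEXT index on `articles(title)`, which the question does not confirm. `MATCH ... AGAINST` in the select list returns a relevance value computed by MySQL, with no hook for per-phrase scores or per-column multipliers.

```sql
-- Sketch only: assumes a FULLTEXT index exists on articles(title).
SELECT
    id,
    title,
    -- Relevance is computed by MySQL; it cannot be given custom per-phrase weights.
    MATCH(title) AGAINST ('railway employee' IN NATURAL LANGUAGE MODE) AS relevance
FROM articles
WHERE MATCH(title) AGAINST ('railway employee' IN NATURAL LANGUAGE MODE)
ORDER BY relevance DESC;
```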

select t.id, sum(tw.score * tw.factor) as score 
from (select t.* 
      . . . 
      -- full-text prefilter: keep only rows that contain one of the words 
      where MATCH(title, summary, content) AGAINST ('railway employee' IN BOOLEAN MODE) 
     ) t left outer join 
     -- every (term, location) combination with its score and weight 
     (select terms.score, terms.term, wherefactor.which, wherefactor.factor 
      from (SELECT 3 score, 'a railway employee' term UNION ALL 
            SELECT 2 score, 'railway employee' term UNION ALL 
            SELECT 2 score, 'a railway' term UNION ALL 
            SELECT 1 score, 'employee' term UNION ALL 
            SELECT 1 score, 'railway' term UNION ALL 
            SELECT 0 score, 'a' term 
           ) terms cross join 
           (SELECT 'T' as which, 1.0 as factor UNION ALL 
            SELECT 'S', 1.1 UNION ALL 
            SELECT 'C', 1.5 
           ) wherefactor 
     ) tw 
     -- LIKE against whichever column the weight row refers to 
     on (case when tw.which = 'T' then t.title 
              when tw.which = 'S' then t.summary 
              when tw.which = 'C' then t.content 
         end) like concat('%', tw.term, '%') 
group by t.id; 

This should give you the performance of full-text search along with the specifics of your scoring algorithm.
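
A fuller version of the same idea, written against the tables from the question, might look like the sketch below. It rests on assumptions: FULLTEXT indexes on `articles(title, summary)` and on `articleSections(content)`, neither of which is confirmed by the question. A single `MATCH` cannot span two tables, so the prefilter runs once per branch. The aliases `body`, `multiplier`, and `total_score` are illustrative names.

```sql
-- Sketch only. Assumes FULLTEXT indexes on articles(title, summary) and
-- articleSections(content); both index definitions are assumptions.
SELECT a.id, a.title, SUM(terms.score * s.multiplier) AS total_score
FROM (
    -- Stage 1: full-text prefilter, one branch per searched column.
    SELECT id, 1.0 AS multiplier, title AS body
    FROM articles
    WHERE MATCH(title, summary) AGAINST ('railway employee' IN BOOLEAN MODE)
    UNION ALL
    SELECT id, 1.1, summary
    FROM articles
    WHERE summary <> ''
      AND MATCH(title, summary) AGAINST ('railway employee' IN BOOLEAN MODE)
    UNION ALL
    SELECT artId, 1.5, content
    FROM articleSections
    WHERE MATCH(content) AGAINST ('railway employee' IN BOOLEAN MODE)
) s
-- Stage 2: LIKE-based scoring, applied only to the prefiltered rows.
INNER JOIN (
    SELECT 3 AS score, 'a railway employee' AS term UNION ALL
    SELECT 2, 'railway employee' UNION ALL
    SELECT 2, 'a railway' UNION ALL
    SELECT 1, 'employee' UNION ALL
    SELECT 1, 'railway'
) terms ON s.body LIKE CONCAT('%', terms.term, '%')
INNER JOIN articles a ON a.id = s.id
GROUP BY a.id, a.title
ORDER BY total_score DESC, a.title;
```

The scores can differ from the pure LIKE query wherever a term appears only as part of a longer word (for example "railwayman"), because the full-text prefilter matches whole words while LIKE matches substrings.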

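If you have a known lexicon, another possibility is to build a document-term table. Such a table would have one row for each document and each term in that document that you care about (the set of such terms is called the "lexicon"). With such a data structure, you are free to implement whatever scoring mechanism you choose.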
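
A document-term table along these lines could be sketched as follows. The table name `documentTerms`, its columns, and the index are all hypothetical, and populating the table (when articles are saved, or in a batch job) is left out.

```sql
-- Hypothetical schema: one row per article, location, and lexicon term.
CREATE TABLE documentTerms (
    artId INT NOT NULL,
    loc   CHAR(1) NOT NULL,        -- 'T' title, 'S' summary, 'C' content
    term  VARCHAR(100) NOT NULL,   -- a word or phrase from the lexicon
    PRIMARY KEY (artId, loc, term),
    KEY idx_term (term)
);

-- Scoring becomes an equality join on the term, with no LIKE or MATCH.
SELECT dt.artId, SUM(terms.score * w.multiplier) AS total_score
FROM documentTerms dt
INNER JOIN (
    SELECT 3 AS score, 'a railway employee' AS term UNION ALL
    SELECT 2, 'railway employee' UNION ALL
    SELECT 2, 'a railway' UNION ALL
    SELECT 1, 'employee' UNION ALL
    SELECT 1, 'railway'
) terms ON terms.term = dt.term
INNER JOIN (
    SELECT 'T' AS loc, 1.0 AS multiplier UNION ALL
    SELECT 'S', 1.1 UNION ALL
    SELECT 'C', 1.5
) w ON w.loc = dt.loc
GROUP BY dt.artId
ORDER BY total_score DESC;
```

With this primary key, a term found in several content sections of one article counts once, whereas the original LIKE query counts it once per matching section. Adding a section identifier to the key would restore the per-section behaviour.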