2012-09-10 40 views
2

MySQL FULLTEXT query yields unexpected ranking; why?

I'm trying to do a full-text search on tags, but it isn't working properly for me. Please see the attached image.

The query is:

SELECT *,
       MATCH(tags) AGAINST ('tag3 tag6 tag4') AS score
  FROM items
 ORDER BY score DESC

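Why are the rows not sorted in the right order by the score field? If you look, the second row has all the tags I searched for, while the first row doesn't have the tag3 keyword.

I mean the id order should be 5, 1, 2, ... and NOT 1, 5, 2, ...

Where is my mistake?

Next, I want to search the tags field first and, if there are no results, search the description field with FULLTEXT using the same keywords. So the user would search both tags and description, falling back to description when the tags don't match. Is that possible in a single query, or do I need two separate queries?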

+1

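If the code you're asking about is shown as an image rather than as text, you make it harder for the people answering. It means we have to retype it, which is a pain in the neck. –

Answers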

2

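The documentation at http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html says: "For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results."

If your items table is small (a sample table, for example), you are probably hitting this problem and getting a "bizarre" result.

You may want to try this query IN BOOLEAN MODE and see whether the results match your expectations. Try this: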

SELECT *,
       MATCH(tags) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) AS score
  FROM items
 ORDER BY score DESC

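Boolean mode disables word-distribution ranking. Note that you should understand the difference between natural language mode and boolean mode and, once you have a decently sized table, choose wisely which one to use. If you're looking for the kind of tags that blogs have, boolean mode is probably the way to go.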

+0

woooooooooow fixed! – sbaaaang

+0

Now what if I need to search in both the tags and description fields, giving priority to the tags field? :D – sbaaaang

+1

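Please read the reference manual's documentation on full-text search. It is a fairly elaborate and complex kind of search. If you need both boolean tag-style search and natural-language description search, you may have to normalize the tags column into a new table with one tag per row, and use FULLTEXT only for the description search. The problem with natural-language search is that it de-emphasizes very common words in the search ranking. So if you have some very common tags, searching for them may not work as well as you'd like. –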
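
The normalization suggested in this comment could be sketched as follows. This is only an illustration under stated assumptions: the `item_tags` table, its columns, and the `matched_tags` alias are hypothetical names that do not appear in the thread, and the table would have to be populated from the existing `tags` column.

```sql
-- Hypothetical table (not from the thread): one row per (item, tag) pair.
CREATE TABLE item_tags
(
    item_id INT NOT NULL,
    tag     VARCHAR(30) NOT NULL,
    PRIMARY KEY (item_id, tag),
    KEY tag_ndx (tag)
) ENGINE=MyISAM;

-- Rank items by how many of the requested tags they carry.
-- Plain equality matching is used here, so no FULLTEXT index is involved.
SELECT i.*, COUNT(*) AS matched_tags
  FROM items i
  JOIN item_tags t ON t.item_id = i.id
 WHERE t.tag IN ('tag3', 'tag6', 'tag4')
 GROUP BY i.id
 ORDER BY matched_tags DESC;
```

With this layout the ranking is a simple count of matching tags, so a common tag counts exactly as much as a rare one.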

0

Change the ORDER BY to score DESC, id DESC.
When the score values are the same, the row with the higher id will be shown first.
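
Applied to the query from the question, that suggestion would look roughly like this (a sketch; only the ORDER BY clause changes):

```sql
-- Same query as in the question, with id added as a tie-breaker.
SELECT *,
       MATCH(tags) AGAINST ('tag3 tag6 tag4') AS score
  FROM items
 ORDER BY score DESC,  -- best-scoring rows first
          id DESC;     -- among equal scores, the higher id first
```

Note that this only decides the order of rows whose scores are equal; it does not change the scores themselves.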

+0

That's right. What I can't understand is why 1 and 5 have the same score, because 5 matches all the keywords while 1 matches only 2 of them :/ – sbaaaang

+0

I also think ordering by id is not a good fix – sbaaaang

+0

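When you specify 2 columns to sort by, the data is displayed in that order. So, with my code, **score** is compared first; if the scores are the same, **id** is compared; if the ids differ, the higher number is shown first. –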

1

First, here is your sample data loaded into MySQL 5.5.12 on my Windows 7 machine:

mysql> DROP DATABASE IF EXISTS lspuk; 
Query OK, 1 row affected (0.00 sec) 

mysql> CREATE DATABASE lspuk; 
Query OK, 1 row affected (0.00 sec) 

mysql> USE lspuk 
Database changed 
mysql> CREATE TABLE items 
    -> (
    ->  id int not null auto_increment, 
    ->  description VARCHAR(30), 
    ->  tags VARCHAR(30), 
    ->  primary key (id), 
    ->  FULLTEXT tags_ftndx (tags) 
    ->) ENGINE=MyISAM; 
Query OK, 0 rows affected (0.04 sec) 

mysql> INSERT INTO items (description,tags) VALUES 
    -> ('the first' ,'tag1 tag3 tag4'), 
    -> ('the second','tag5 tag1 tag2'), 
    -> ('the third' ,'tag5 tag1 tag9'), 
    -> ('the fourth','tag5 tag6 tag2'), 
    -> ('the fifth' ,'tag4 tag3 tag6'), 
    -> ('the sixth' ,'tag2 tag3 tag6'); 
Query OK, 6 rows affected (0.00 sec) 
Records: 6 Duplicates: 0 Warnings: 0 

mysql> 

Please look at how the tags are distributed across the rows in MySQL:

mysql> SELECT 'tag1',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag1%' UNION 
    -> SELECT 'tag2',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag2%' UNION 
    -> SELECT 'tag3',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag3%' UNION 
    -> SELECT 'tag4',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag4%' UNION 
    -> SELECT 'tag5',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag5%' UNION 
    -> SELECT 'tag6',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag6%' UNION 
    -> SELECT 'tag9',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag9%'; 
+------+-----------+
| tag1 | tag_count |
+------+-----------+
| tag1 |         3 |
| tag2 |         3 |
| tag3 |         3 |
| tag4 |         2 |
| tag5 |         3 |
| tag6 |         3 |
| tag9 |         1 |
+------+-----------+
7 rows in set (0.00 sec) 

mysql> 

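Look carefully and note the following facts:

  1. Every row has exactly 3 tags
  2. The order in which the tags are requested, and how many rows contain each tag, seem to control the score

If you remove tag4 and run the query, every row gets a score of 0: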

mysql> SELECT *,MATCH(tags) AGAINST ('tag3 tag6') as score FROM items ORDER BY score DESC; 
+----+-------------+----------------+-------+
| id | description | tags           | score |
+----+-------------+----------------+-------+
|  1 | the first   | tag1 tag3 tag4 |     0 |
|  2 | the second  | tag5 tag1 tag2 |     0 |
|  3 | the third   | tag5 tag1 tag9 |     0 |
|  4 | the fourth  | tag5 tag6 tag2 |     0 |
|  5 | the fifth   | tag4 tag3 tag6 |     0 |
|  6 | the sixth   | tag2 tag3 tag6 |     0 |
+----+-------------+----------------+-------+
6 rows in set (0.00 sec) 
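
One likely explanation for the all-zero result above is the 50% threshold that the MySQL manual describes for natural-language searches on MyISAM tables: a word that appears in half or more of the rows is treated as too common and does not contribute to the match. In the tag counts shown earlier, tag3 and tag6 each appear in 3 of the 6 rows, while tag4 appears in only 2. The following sketch computes that share per search term; the derived table `t` and the aliases `rows_with_tag` and `share_of_rows` are illustrative names, and `LIKE` is used only as a rough stand-in for the full-text tokenizer.

```sql
-- For each search term, count the rows containing it and its share of the table.
-- Under the assumption above, terms with a share of 0.5 or more would be
-- ignored by a natural-language MATCH ... AGAINST on a MyISAM table.
SELECT t.tag,
       COUNT(i.id) AS rows_with_tag,
       COUNT(i.id) / (SELECT COUNT(*) FROM items) AS share_of_rows
  FROM (SELECT 'tag3' AS tag UNION ALL
        SELECT 'tag4' UNION ALL
        SELECT 'tag6') AS t
  LEFT JOIN items i ON i.tags LIKE CONCAT('%', t.tag, '%')
 GROUP BY t.tag;
```

If that reading is right, only tag4 carries any weight in natural-language mode on this sample, which would also explain why rows 1 and 5 tie in the three-tag search.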

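The evaluation method seems to base the score on the average number of tokens per field, and the presence and/or absence of specific values in a specific order affects the score. If you apply different styles of scoring and tag specification, note the various scores: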

mysql> SELECT *,MATCH(tags) AGAINST ('tag3 tag6 tag4') as score FROM items ORDER BY score DESC; 
+----+-------------+----------------+--------------------+
| id | description | tags           | score              |
+----+-------------+----------------+--------------------+
|  1 | the first   | tag1 tag3 tag4 | 0.6700310707092285 |
|  5 | the fifth   | tag4 tag3 tag6 | 0.6700310707092285 |
|  2 | the second  | tag5 tag1 tag2 |                  0 |
|  3 | the third   | tag5 tag1 tag9 |                  0 |
|  4 | the fourth  | tag5 tag6 tag2 |                  0 |
|  6 | the sixth   | tag2 tag3 tag6 |                  0 |
+----+-------------+----------------+--------------------+
6 rows in set (0.00 sec) 

mysql> SELECT *,MATCH(tags) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) as score FROM items ORDER BY score DESC; 
+----+-------------+----------------+-------+
| id | description | tags           | score |
+----+-------------+----------------+-------+
|  5 | the fifth   | tag4 tag3 tag6 |     3 |
|  1 | the first   | tag1 tag3 tag4 |     2 |
|  6 | the sixth   | tag2 tag3 tag6 |     2 |
|  4 | the fourth  | tag5 tag6 tag2 |     1 |
|  2 | the second  | tag5 tag1 tag2 |     0 |
|  3 | the third   | tag5 tag1 tag9 |     0 |
+----+-------------+----------------+-------+
6 rows in set (0.00 sec) 

mysql> SELECT *,MATCH(tags) AGAINST ('+tag3 +tag6 +tag4' IN BOOLEAN MODE) as score FROM items ORDER BY score DESC; 
+----+-------------+----------------+-------+
| id | description | tags           | score |
+----+-------------+----------------+-------+
|  5 | the fifth   | tag4 tag3 tag6 |     1 |
|  1 | the first   | tag1 tag3 tag4 |     0 |
|  2 | the second  | tag5 tag1 tag2 |     0 |
|  3 | the third   | tag5 tag1 tag9 |     0 |
|  4 | the fourth  | tag5 tag6 tag2 |     0 |
|  6 | the sixth   | tag2 tag3 tag6 |     0 |
+----+-------------+----------------+-------+
6 rows in set (0.00 sec) 

mysql> 

The solution seems to be to evaluate a BOOLEAN MODE score and then a non-BOOLEAN MODE score, as follows:

SELECT *, 
MATCH(tags) AGAINST ('tag3 tag6 tag4') as score1, 
MATCH(tags) AGAINST ('+tag3 +tag6 +tag4' IN BOOLEAN MODE) as score2 
FROM items ORDER BY score2 DESC, score1 DESC; 

Here is the result against your sample data:

mysql> SELECT *, 
    -> MATCH(tags) AGAINST ('tag3 tag6 tag4') as score1, 
    -> MATCH(tags) AGAINST ('+tag3 +tag6 +tag4' IN BOOLEAN MODE) as score2 
    -> FROM items ORDER BY score2 DESC, score1 DESC; 
+----+-------------+----------------+--------------------+--------+
| id | description | tags           | score1             | score2 |
+----+-------------+----------------+--------------------+--------+
|  5 | the fifth   | tag4 tag3 tag6 | 0.6700310707092285 |      1 |
|  1 | the first   | tag1 tag3 tag4 | 0.6700310707092285 |      0 |
|  2 | the second  | tag5 tag1 tag2 |                  0 |      0 |
|  3 | the third   | tag5 tag1 tag9 |                  0 |      0 |
|  4 | the fourth  | tag5 tag6 tag2 |                  0 |      0 |
|  6 | the sixth   | tag2 tag3 tag6 |                  0 |      0 |
+----+-------------+----------------+--------------------+--------+
6 rows in set (0.00 sec) 

mysql> 

Or you could try it without the plus signs:

mysql> SELECT *, 
    -> MATCH(tags) AGAINST ('tag3 tag6 tag4') as score1, 
    -> MATCH(tags) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) as score2 
    -> FROM items ORDER BY score2 DESC, score1 DESC; 
+----+-------------+----------------+--------------------+--------+
| id | description | tags           | score1             | score2 |
+----+-------------+----------------+--------------------+--------+
|  5 | the fifth   | tag4 tag3 tag6 | 0.6700310707092285 |      3 |
|  1 | the first   | tag1 tag3 tag4 | 0.6700310707092285 |      2 |
|  6 | the sixth   | tag2 tag3 tag6 |                  0 |      2 |
|  4 | the fourth  | tag5 tag6 tag2 |                  0 |      1 |
|  2 | the second  | tag5 tag1 tag2 |                  0 |      0 |
|  3 | the third   | tag5 tag1 tag9 |                  0 |      0 |
+----+-------------+----------------+--------------------+--------+
6 rows in set (0.00 sec) 

mysql> 

Either way, you have to include both the BOOLEAN MODE and the non-BOOLEAN MODE scores.
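
For the follow-up question about searching both tags and description while giving priority to tags, the same two-key ordering idea might be extended as sketched below. This is an assumption-laden illustration, not something tested in this thread: the sample table has no FULLTEXT index on description, so the sketch adds one under the hypothetical name `description_ftndx`, and `tag_score` and `desc_score` are illustrative aliases.

```sql
-- Assumption: description gets its own FULLTEXT index (hypothetical name).
ALTER TABLE items ADD FULLTEXT description_ftndx (description);

-- Keep rows that match in either column, and sort tag matches ahead of
-- description-only matches by ordering on the tag score first.
SELECT *,
       MATCH(tags)        AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) AS tag_score,
       MATCH(description) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) AS desc_score
  FROM items
 WHERE MATCH(tags)        AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE)
    OR MATCH(description) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE)
 ORDER BY tag_score DESC, desc_score DESC;
```

This keeps everything in a single query: rows with any tag match sort first, and rows that match only in the description follow with a tag_score of 0.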
