爲BigQuery的傳統的SQL
SELECT
keyword,
COUNT(1) AS rows,
SUM(INTEGER((LENGTH(Text) - LENGTH(REPLACE(Text, keyword, '')))/LENGTH(keyword))) AS occurences
FROM YourTable
CROSS JOIN keywords
WHERE Text CONTAINS keyword
GROUP BY keyword
實施例與
SELECT
keyword,
COUNT(1) AS rows,
SUM(INTEGER((LENGTH(Text) - LENGTH(REPLACE(Text, keyword, '')))/LENGTH(keyword))) AS occurences
FROM (
SELECT Text FROM
(SELECT 'facebookfacebookcnnbbccnn' AS Text),
(SELECT 'facebook' AS Text),
(SELECT 'cnn' AS Text)
) AS words
CROSS JOIN (
SELECT keyword FROM
(SELECT 'facebook' AS keyword),
(SELECT 'cnn' AS keyword),
(SELECT 'bbc' AS keyword)
) AS keywords
WHERE Text CONTAINS keyword
GROUP BY keyword
對於大量查詢標準SQL播放(見Enabling Standard SQL)
SELECT
keyword,
COUNT(1) AS `rows`,
SUM((LENGTH(Text) - LENGTH(REPLACE(Text, keyword, '')))/LENGTH(keyword)) AS occurences
FROM YourTable
JOIN keywords
ON STRPOS(Text, keyword) > 0
GROUP BY keyword
實施例與
打
「Text LIKE CONCAT('%',keyword,'%')」是危險的,因爲關鍵字可能包含需要轉義的特殊字符。這也不是很高效。在這裏使用更好的函數將是「STRPOS(Text,keyword)> 0」 –
同意,更新! –
這完美的作品!謝謝米哈伊爾。另外 - 有沒有辦法讓這個查詢掃描兩列的關鍵字?例如A列:文本,B列:Text_2 –