Compare a string from one table with each keyword from another table

Table 1 contains a list of sentences:

Sentence 
---------------------------------------------------------------- 
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua 
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur

Table 2 (tblKeyword) contains a list of keywords and their weights:

Keyword      Weight
----------------------------------------------------------------
dolor sit    1
elit         3
foobar       10

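For each sentence in Table 1, I want to find all the keywords from Table 2 that occur in the sentence and get the sum of their weights. How can I do this in SQL without using a cursor?

Expected result: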

Result 
------ 
4 (sum of weight for sentence 1) 
6 (sum of weight for sentence 2) 
...

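Source

2016-07-09 nikademus

What is the expected result? Have you tried anything? –

Hi, I just want the sum for the keywords that can be found in each sentence of Table 1. – nikademus

Since weight can be relative, can you explain what you mean by it? SOUNDEX, DIFFERENCE, etc. can be used for comparison, along with LIKE, a thesaurus, a dictionary, and so on. Try something first, or read MSDN and then ask a specific question. After all, we are not a Google search. –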

This should get you started:

SELECT 
    S.sentence 
    , SUM(K.weight) AS total_weight 
FROM Sentence S 
JOIN Keyword K 
    ON CHARINDEX(K.keyword, S.sentence) > 0 
GROUP BY S.sentence 
;

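(Sorry: I have no SQL Server instance at hand to verify this. I tried it with MySQL, with INSTR in place of CHARINDEX.)

Please comment if and as this requires adjustment or further detail.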
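
A self-contained variation on the query above, as a minimal sketch rather than a tested solution. It assumes table variables named `@Sentence` and `@Keyword` and an `id` column, none of which appear in the question. Grouping by `id` keeps two identical sentences apart, and the `LEFT JOIN` with `COALESCE` keeps sentences that match no keyword, reporting 0 for them instead of dropping the row.

```sql
-- Hypothetical table variables standing in for the real tables;
-- the id column is an assumption made for this sketch.
DECLARE @Sentence TABLE (id int IDENTITY(1,1) PRIMARY KEY, sentence varchar(1000));
DECLARE @Keyword  TABLE (keyword varchar(1000), weight int);

INSERT INTO @Sentence (sentence) VALUES
 ('Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua')
,('Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur');

INSERT INTO @Keyword (keyword, weight) VALUES
 ('dolor sit', 1), ('elit', 3), ('foobar', 10);

SELECT
    S.id
    , COALESCE(SUM(K.weight), 0) AS total_weight  -- 0 when no keyword matches
FROM @Sentence AS S
LEFT JOIN @Keyword AS K
    ON CHARINDEX(K.keyword, S.sentence) > 0       -- keyword occurs somewhere in the sentence
GROUP BY S.id
ORDER BY S.id;
```

CHARINDEX matches plain substrings, so with the sample data the keyword 'elit' is also found inside 'velit' in the second sentence. If only whole words should count, the join condition would need extra handling for word boundaries.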

Source

2016-07-09 06:37:27 Abecee

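Yes, thanks, I think your answer should work. I only modified it slightly to run on SQL Server 2014: changed CHARINDEX to CHARINDEX(K.keyword, S.sentence) – nikademus

@nikademus: Thanks for the hint. Sorry for the oversight. Fixed. – Abecee

Nice, but.... – Ajay2707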

Ideally I would not recommend a cursor, but for your requirement the only option I found is a cursor. Please check:

declare @table1 table (sentences varchar(1000)) 
declare @table2 table (keyword varchar(1000) , weight int) 

insert into @table1 values 
('Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua') 
,('Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur') 


insert into @table2 values 
('dolor sit' , 1),('elit' , 3),('foobar' , 10) 

/*Declare temp variable*/ 
declare @tab TABLE (sentSeqno int, result int) --to get final result create temporary table and insert temp data 

declare @keyword varchar(1000), @weight int , @Recordno int = 0 , @sentences varchar(1000); 

/*first cursor for table 1 for iteration of each row*/ 
DECLARE @cursorouter CURSOR ; 
SET @cursorouter = CURSOR FOR select * from @table1 
OPEN @cursorouter 
FETCH NEXT FROM @cursorouter INTO @sentences --FILL IN CURSOR TO LOOP THROUGH 
/* loop first cursor while having data*/ 
WHILE @@FETCH_STATUS = 0 
    BEGIN 

     /*Insert into temp data*/ 
     Set @Recordno = @Recordno + 1 
     insert into @tab values(@Recordno, 0) 

     /*Second cursor for table 2 to for iteration of each row*/ 
     DECLARE @cursorInner CURSOR ; 
     SET @cursorInner = CURSOR FOR 

     select * from @table2 
     OPEN @cursorInner 
     FETCH NEXT 
     FROM @cursorInner INTO @keyword , @weight --FILL IN CURSOR TO LOOP THROUGH 

     WHILE @@FETCH_STATUS = 0 
      BEGIN 
        /*to find the keyword in a sentence, we start loop and whenever found the keyword, we store its weightage*/ 
        Declare @pos int =0 
        Declare @oldpos int = 0 

        select @pos= patindex('%' + @keyword +'%' , @sentences) 
        /*While loop for each keyword found in sentence, we store the weightage into temp table*/ 
        while @pos > 0 and @oldpos <> @pos 
        begin 
         /*update temp data to store result*/ 
         update @tab set result = result+ @weight where sentSeqno = @Recordno 
         Select @oldpos = @pos 
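         /*look for the next occurrence in the rest of the sentence; patindex returns 0 when there is none, which leaves @pos unchanged and ends the loop*/ 
         select @pos = patindex('%' + @keyword + '%', substring(@sentences, @pos + 1, len(@sentences))) + @pos 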
        end 

      FETCH NEXT 
      FROM @cursorInner INTO @keyword , @weight; 
     END 
     CLOSE @cursorInner; 
     DEALLOCATE @cursorInner; 

    FETCH NEXT 
     FROM @cursorouter INTO @sentences; 
    END 
    CLOSE @cursorouter; 
    DEALLOCATE @cursorouter; 

select * from @tab
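
The inner loop above is meant to add a keyword's weight once for every occurrence in the sentence. The same count can probably be obtained without cursors by comparing the sentence length before and after removing the keyword. The following is a minimal sketch under stated assumptions, not a benchmarked replacement: the table variables `@Sentence` and `@Keyword` and the `id` column are hypothetical, the data is `varchar` in a single-byte collation (so `DATALENGTH` equals the character count), and occurrences are counted without overlap.

```sql
-- Hypothetical table variables; the id column is an assumption made for this sketch.
DECLARE @Sentence TABLE (id int IDENTITY(1,1) PRIMARY KEY, sentence varchar(1000));
DECLARE @Keyword  TABLE (keyword varchar(1000), weight int);

INSERT INTO @Sentence (sentence) VALUES
 ('Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua')
,('Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur');

INSERT INTO @Keyword (keyword, weight) VALUES
 ('dolor sit', 1), ('elit', 3), ('foobar', 10);

SELECT
    S.id AS sentSeqno
    , COALESCE(SUM(
          K.weight *
          -- occurrences = characters removed by REPLACE / keyword length
          ((DATALENGTH(S.sentence) - DATALENGTH(REPLACE(S.sentence, K.keyword, '')))
           / NULLIF(DATALENGTH(K.keyword), 0))   -- NULLIF guards against empty keywords
      ), 0) AS result
FROM @Sentence AS S
LEFT JOIN @Keyword AS K
    ON CHARINDEX(K.keyword, S.sentence) > 0      -- only join keywords that occur at all
GROUP BY S.id
ORDER BY S.id;
```

`DATALENGTH` is used instead of `LEN` because `LEN` ignores trailing spaces, which would skew the count for keywords or sentences that end in a space. For `nvarchar` columns the byte counts double on both sides of the division, so the ratio should still hold, but that case is not covered here.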

Source

2016-07-09 06:36:02 Ajay2707

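Hi, thanks for your suggestion. I did something similar, but with more sentences (about 10,000 rows) I found the performance to be very slow. My configuration is SQL Server 2014 on a XEON E3 1220 v3. It takes a very long time to run. – nikademus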
