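Let's suppose we have somehow managed to split every description into separate words. So instead of a single row with ID = 1 and Description = 'My NAME is Sajid KHAN', we would have this: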
ID | Description
--- | ------------
1 | My
1 | NAME
1 | is
1 | Sajid
1 | KHAN
That is 5 rows. With the data in this form the count would be trivial, something like:
select Description, count(*) from data_in_new_form group by Description
So we use a recursive query to get the data into that form. Sample data first, then the query:
create table mytable
as
select 1 as ID, 'My NAME is Sajid KHAN' as Description from dual
union all
select 2, 'My Name is Ahmed Khan' from dual
union all
select 3, 'MY friend name is Salman Khan' from dual
union all
select 4, 'test, punctuation! it is' from dual
;
with
rec (id, str, depth, element_value) as
(
  -- Anchor member: first word of each description.
  -- '(.*?)( |$)' lazily captures the characters up to the next space or the end of the string.
  select id, upper(Description) as str, 1 as depth,
         REGEXP_SUBSTR(upper(Description), '(.*?)( |$)', 1, 1, NULL, 1) AS element_value
  from mytable
  UNION ALL
  -- Recursive member: the (depth + 1)-th word of the same string.
  select id, str, depth + 1,
         REGEXP_SUBSTR(str, '(.*?)( |$)', 1, depth + 1, NULL, 1) AS element_value
  from rec
  where depth < regexp_count(str, ' ') + 1
)
, data as (
  select * from rec
  --order by id, depth
)
select element_value, count(*) from data
group by element_value
order by element_value
;
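Note that this version does nothing about punctuation and assumes that words are separated by single spaces.
The same thing using a hierarchical query: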
with rec as
(
  SELECT id, LEVEL AS depth,
         REGEXP_SUBSTR(upper(description), '(.*?)( |$)', 1, LEVEL, NULL, 1) AS element_value
  FROM mytable
  -- One row per word: number of spaces + 1.
  CONNECT BY LEVEL <= regexp_count(description, ' ') + 1
         and prior id = id
         -- Makes every row unique so Oracle does not report a CONNECT BY loop.
         and prior SYS_GUID() is not null
)
, data as (
  select * from rec
  --order by id, depth
)
select element_value, count(*) from data
group by element_value
order by 2 desc
;
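Both queries above leave punctuation attached to the words, so 'test,' and 'test' would be counted separately. One possible way around that, as a minimal sketch only: it assumes Oracle 11g or later and the mytable sample data above, and the CTE names cleaned and words are made up for illustration. It replaces every run of non-alphanumeric characters with a single space before splitting, which also takes care of repeated spaces.

```sql
with cleaned as
(
  -- Assumption: anything that is not a letter or digit is a word separator.
  select id,
         trim(regexp_replace(upper(description), '[^[:alnum:]]+', ' ')) as str
  from mytable
)
, words as
(
  -- Same hierarchical split as above, but on the cleaned string.
  select id, level as depth,
         regexp_substr(str, '[^ ]+', 1, level) as element_value
  from cleaned
  where str is not null
  connect by level <= regexp_count(str, '[^ ]+')
         and prior id = id
         and prior sys_guid() is not null
)
select element_value, count(*) as cnt
from words
group by element_value
order by cnt desc, element_value
;
```

Whether digits, apostrophes or hyphens should count as part of a word depends on your data, so adjust the character class accordingly.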