2016-01-03 69 views

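MySQL SUBSTRING_INDEX: extract every distinct string in a column

I want to get every distinct string value between the delimiters in MySQL. I tried the SUBSTRING_INDEX function. It works for the first string and the continuations of the first string, but not for the second string. Here is what I mean: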

[Images: table x and the result]

SELECT SUBSTRING_INDEX(path, ':', 2) as p, sum(count) as N From x Group by p UNION 
SELECT SUBSTRING_INDEX(path, ':', 3) as p, sum(count) From x Group by p UNION 
SELECT SUBSTRING_INDEX(path, ':', 4) as p, sum(count) From x Group by p UNION 
SELECT SUBSTRING_INDEX(path, ':', 5) as p, sum(count) From x Group by p UNION 
SELECT SUBSTRING_INDEX(path, ':', 6) as p, sum(count) From x Group by p; 

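I tried adding SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 2), ':', 2) as p, sum(count) From x Group by p UNION SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 4), ':', 2) as p, sum(count) From x Group by p to the query, but the result is still the same. What I am trying to get is not only the combinations that start with A1, A2, A3, but also the strings B2, C2, D2 taken as the first string, as in the table below: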

| p   | N | 
| :A1   | 4 | 
| ...   | ...| 
| :B1   | 3 | 
| :B1:C2  | 2 | 
|...   | ...| 



Possible duplicate of SQL split values to multiple rows (http://stackoverflow.com/questions/17942508/sql-split-values-to-multiple-rows) –


No path starts with :B1. Can you clarify this output? – amdixon


@RyanVincent I will check it. –





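  • Create a sequence of valid substring start and end positions for each chunk, relying on the fixed node length of 2 in the path (3 characters including the ':' delimiter).
  • Join that sequence to itself to also get the sub-paths that do not run to the end of the path.
  • Use the start and length values calculated above to take substrings of x.path.
  • Aggregate by summing count over those x.path sub-sequences.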


create table x 
(
    path varchar(23) primary key not null, 
    count integer not null 
);

insert into x 
(path, count) 
values
(':A1:B2:C1:D1:G1' , 3), 
(':A1:B2:C1:D1:G4' , 1), 
(':A2:B1:C2:D2:G4' , 2);

-- helper view holding the digits 0..9, used as a sequence generator
drop view if exists digits_v; 
create view digits_v 
as
select 0 as n 
union all 
select 1 union all select 2 union all select 3 union all 
select 4 union all select 5 union all select 6 union all 
select 7 union all select 8 union all select 9;
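
A quick way to check the sequence generator is to list the offsets it yields. This is only a sketch: it assumes the digits_v view above and the 15-character paths with 3-character nodes used in this answer. The filter `n between 0 and 4` is shorthand for the two `between 1 and 15` conditions used in the full query.

```sql
-- List the (start, len) pairs derived from digits_v.
-- Assumes 15-character paths made of 3-character nodes (':A1', ':B2', ...).
select 1 + 3 * seq.n as `start`,
       15 - 3 * seq.n as `len`
from digits_v seq
where seq.n between 0 and 4
order by seq.n;
-- start/len pairs: (1, 15), (4, 12), (7, 9), (10, 6), (13, 3)
```

Each `start` is the position of a ':' delimiter, and each `len` is the number of characters from that position to the end of the path.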


select substring(x.path, `start`, `len`) as chunk, sum(x.count) 
from x 
cross join 
(
    -- every (start, len) pair: o1 supplies the start, o2 the length
    select o1.`start`, o2.`len` 
    from
    (
    select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len` 
    from digits_v seq 
    where 1 + 3 * seq.n between 1 and 15 
    and 15 - 3 * seq.n between 1 and 15 
    ) o1 
    inner join 
    (
    select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len` 
    from digits_v seq 
    where 1 + 3 * seq.n between 1 and 15 
    and 15 - 3 * seq.n between 1 and 15 
    ) o2 
    on o2.`start` >= o1.`start` 
) splices 
where substring(x.path, `start`, `len`) <> '' 
group by substring(x.path, `start`, `len`) 
order by length(substring(x.path, `start`, `len`)), substring(x.path, `start`, `len`);


|  chunk  | sum(x.count) | 
| :A1    |   4 | 
| :A2    |   3 | 
| :A3    |   3 | 
| ...    |   ... | 
| :A1:B2   |   4 | 
| :A2:B1   |   3 | 
| :A3:B3   |   2 | 
| :A3:B4   |   1 | 
| ...    |   ... | 
| :A1:B2:C1  |   4 | 
| :A2:B1:C2  |   2 | 
| :A2:B1:D2  |   3 | 
| :A3:B3:C4  |   2 | 
| :A3:B4:C2  |   1 | 
| ...    |   ... | 
| :A1:B2:C1:D1 |   4 | 
| :A2:B1:C2:D2 |   2 | 
| :A3:B3:C4:D3 |   2 | 
| :A3:B4:C2:D3 |   1 | 
| ...    |   ... | 
| :A1:B2:C1:D1:G1 |   3 | 
| :A1:B2:C1:D1:G4 |   1 | 
| :A2:B1:C2:D2:G4 |   2 | 
| :A3:B3:C4:D3:G7 |   2 | 
| :A3:B4:C2:D3:G7 |   1 | 
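
To see which chunks a single path contributes before aggregation, the same splices can be applied to a literal string. This is a sketch, assuming the digits_v view above. The literal is the first sample row, and `n between 0 and 4` again stands in for the `between 1 and 15` filters.

```sql
-- Every contiguous sub-path of one 15-character path (3 characters per node).
select o1.`start`, o2.`len`,
       substring(':A1:B2:C1:D1:G1', o1.`start`, o2.`len`) as chunk
from
(
    select 1 + 3 * seq.n as `start`
    from digits_v seq
    where seq.n between 0 and 4
) o1
inner join
(
    select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len`
    from digits_v seq
    where seq.n between 0 and 4
) o2
    on o2.`start` >= o1.`start`
order by o1.`start`, o2.`len`;
```

This should return 15 rows: five chunks starting at ':A1' (from ':A1' up to the full path), four starting at ':B2', three at ':C1', two at ':D1', and ':G1' on its own.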



Yes, the output is exactly what I need. Thank you very much! I will study your query to understand these functions. –


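The main things to understand are the sequence generator and the valid substrings (which come from the fixed node length of 2). If the nodes become variable-length, the complexity will increase further ;) – amdixon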
