MySQL query to find duplicates using REGEXP, matching on the first 7 characters

2013-01-14

SELECT nl.legal_name, nl.city, c.description AS Country, it.lei, sec.sym
FROM name_loc nl
INNER JOIN ident_tbl_tmp it ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c ON nl.fk_cnty_id = c.id
WHERE nl.legal_name REGEXP '^For'
LIMIT 100;

Using the query above returns 500+ rows of data. Part of the output:

+------------------------------------------+--------------+----------------+----------------------+--------------+
| legal_name                               | city         | Country        | lei                  | sym          |
+------------------------------------------+--------------+----------------+----------------------+--------------+
| FOREFRONT GROUP LTD HKD0.01(SUB          | PENDING      | HONG KONG      | NA                   | 2903.HK      |
| FOREFRONT HOLDINGS                       | PENDING      | UNITED STATES  | NA                   | FFHN         |
| FOREIGN & COL INV TR                     | PENDING      | UNITED STATES  | NA                   | FLIVF        |
| Foreign & Colonial Investment Trust      | PENDING      | NEW ZEALAND    | NA                   | FCT.NZ       |
| Foreign & Colonial Investment Trust      | PENDING      | UNITED KINGDOM | NA                   | FRCL.L       |
| Foreign & Colonial Investment Trust PLC  | London       | UNITED KINGDOM | 8VHDVYVI7W11JH2PAC61 | NA           |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E1:B0I.SI    |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E2:B0I.SI    |
+------------------------------------------+--------------+----------------+----------------------+--------------+

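I need a query that returns a row only when its first n characters of legal_name match those of another row and the country is the same.

This would be the correct result for matching on the first 7 characters: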

+------------------------------------------+--------------+----------------+----------------------+--------------+
| legal_name                               | city         | Country        | lei                  | sym          |
+------------------------------------------+--------------+----------------+----------------------+--------------+
| Foreign & Colonial Investment Trust      | PENDING      | UNITED KINGDOM | NA                   | FRCL.L       |
| Foreign & Colonial Investment Trust PLC  | London       | UNITED KINGDOM | 8VHDVYVI7W11JH2PAC61 | NA           |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E1:B0I.SI    |
| Foreland                                 | PENDING      | SINGAPORE      | NA                   | E2:B0I.SI    |
+------------------------------------------+--------------+----------------+----------------------+--------------+

This would be the correct result for matching on the first 14 characters:

+------------------------------------------+--------------+----------------+----------------------+--------------+
| legal_name                               | city         | Country        | lei                  | sym          |
+------------------------------------------+--------------+----------------+----------------------+--------------+
| Foreign & Colonial Investment Trust      | PENDING      | UNITED KINGDOM | NA                   | FRCL.L       |
| Foreign & Colonial Investment Trust PLC  | London       | UNITED KINGDOM | 8VHDVYVI7W11JH2PAC61 | NA           |
+------------------------------------------+--------------+----------------+----------------------+--------------+

I have tried various subqueries, but with no luck. I think I may need a function or a procedure, but I'm not sure.

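I'm confused: how does "Foreign & Colonial ..." match "Foreland" on the first 7 characters?

Sorry, my mistake. The first 7 characters of "Foreign & Colonial ..." are "Foreign", so those two rows are duplicates of each other based on the first 7 characters of legal_name. – John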

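Answer

You can simply GROUP BY Country, LEFT(legal_name, 7). This ensures you get only one row of output for each combination of country and name prefix. You have no control over which row of each group is returned. If you want to keep track of how many original rows each group had, you can also add a column COUNT(*) AS number_of_duplicates.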
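As a minimal sketch of that grouping, assuming the tables and joins from the question (the alias name_prefix is made up for illustration), the query might look like this. It selects only the grouped columns and the count, so it should also run when the ONLY_FULL_GROUP_BY SQL mode is enabled:

```sql
-- One output row per (country, first 7 characters of legal_name).
-- name_prefix is a hypothetical alias, not part of the original schema.
SELECT c.description          AS Country,
       LEFT(nl.legal_name, 7) AS name_prefix,
       COUNT(*)               AS number_of_duplicates
FROM name_loc nl
INNER JOIN ident_tbl_tmp it ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c ON nl.fk_cnty_id = c.id
WHERE nl.legal_name REGEXP '^For'
GROUP BY c.description, LEFT(nl.legal_name, 7);
```

To match on the first 14 characters instead, change both occurrences of 7 to 14. Whether "FOREIGN" and "Foreign" end up in the same group depends on the collation of legal_name.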

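Thanks, this isn't quite what I was looking for, but it will work. One question: how can I limit the select (the rows printed) to only those where "COUNT(*) AS number_of_duplicates" is greater than 1? I tried "SELECT if(count(*) >= 2, nl.legal_name, nl.city, c.description 'Country', it.lei, sec.sym, COUNT(*) AS number_of_duplicates, '')" and got an error message. Thanks. – John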

@John: Restrictions that use aggregate functions are written with 'HAVING'. So you would write either 'HAVING number_of_duplicates > 1' or 'HAVING COUNT(*) > 1'. – MvG
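A sketch of how the HAVING filter could be applied, again assuming the schema from the question. The first statement keeps only the groups with more than one row. The second joins the duplicate groups back to the detail rows, which is closer to the output the question asks for. The alias names name_prefix and dup are made up for illustration. The second statement counts duplicates in name_loc alone and groups by fk_cnty_id, so it assumes the other joins neither drop nor multiply rows and that each country has exactly one row in countries:

```sql
-- 1) One row per (country, prefix) group that has more than one member.
SELECT c.description          AS Country,
       LEFT(nl.legal_name, 7) AS name_prefix,
       COUNT(*)               AS number_of_duplicates
FROM name_loc nl
INNER JOIN ident_tbl_tmp it ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c ON nl.fk_cnty_id = c.id
WHERE nl.legal_name REGEXP '^For'
GROUP BY c.description, LEFT(nl.legal_name, 7)
HAVING COUNT(*) > 1;

-- 2) Every detail row that belongs to such a duplicate group.
--    Assumption: duplicates are counted in name_loc only (see note above).
SELECT nl.legal_name, nl.city, c.description AS Country, it.lei, sec.sym
FROM name_loc nl
INNER JOIN ident_tbl_tmp it ON nl.fk_ident_id = it.id
INNER JOIN sym_exch_cnty sec ON it.fk_sec_id = sec.id
INNER JOIN countries c ON nl.fk_cnty_id = c.id
INNER JOIN (
    SELECT fk_cnty_id, LEFT(legal_name, 7) AS name_prefix
    FROM name_loc
    GROUP BY fk_cnty_id, LEFT(legal_name, 7)
    HAVING COUNT(*) > 1
) dup ON dup.fk_cnty_id = nl.fk_cnty_id
     AND dup.name_prefix = LEFT(nl.legal_name, 7)
WHERE nl.legal_name REGEXP '^For'
ORDER BY Country, nl.legal_name;
```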

Cool, thanks for the help. – John