2011-09-11 116 views
5

首先,我有一個非常具體的問題,但也許對我的問題(第二部分)的替代方法也可以幫助我。迭代mysql中字符串的字符

有沒有辦法通過mysql中的索引來處理字符串中的字符。 (即在PHP $ var [2]中會給你第三個字符)?

顯而易見的方法是SUBSTRING(var, 3,1)但由於我的字符串是1024個字符長我認爲這不是最快的解決方案。如代碼示例中所示,使用子字符串檢索字符串的尾部也會獲得性能差異。有沒有可能迭代一個字符串? (移後的第一要素?)

CREATE FUNCTION hashDiff(hash1 TEXT(1024), hash2 TEXT(1024), threshold INT) 
RETURNS INT 
DETERMINISTIC 
BEGIN 
    DECLARE diff, x, b1, b2 INT; 
    SET diff =0; 
    SET x = 0; 
    WHILE (x<1024 AND diff<threshold) DO 
     SET b1 = ASCII(hash1); --uses first character only!! 
     SET b2 = ASCII(hash2); 
     SET hash1=SUBSTRING(hash1, 2); 
     SET hash2=SUBSTRING(hash2, 2); 
     SET diff=diff+ ((b1-b2)*(b1-b2)); 
     SET x=x+1; 
    END WHILE; 
    RETURN diff; 
END 

如果您尚未從代碼讀它,我試着寫一個存儲過程來計算哈希值之間的差或距離。差別是字符平方距離的總和(即hashDiff(AA,AC)=(65-65)²+(65-67)²=4)。如果哈希值已經不同,則可以通過引入一個閾值來取消計算來實現首要的性能提升。但是因爲mysql不是我的「每天」語言,所以我堅持在這一點上尋找其他優化。出於完整性兩個樣本哈希:

YAAAAAAYAAAYAAVAAQAARAOAAOAQASAQAMAKAKAJIAJAJIAHAHIAKJAIIAHHAHIIAIHGAGFFAGGFEAFEEEEAEDDDDDAEEEEDEEEFAFFFFFFEFFFEFFFFFGFEEFFEEEFFFJEFFEEEEEEELFFFFEEFJEEEEDIEEEEEIEEEEHEEEJEEFKFEFKGGFNHGOIIJTJKYONYNMTGHNHHQISJJQIKWLXJJSMYRQWJOGKDDFCCBBAAAAAAAAAAAAAAAAAAAAAAAYAAAAAAYAAAYAAWAARAASASAAQARAUAYAYATAOALKAJAJIAIAHHAHGAGFAFFAEFFAEFFAFFFAEEFFAFEEEDADEDDDDADDDCDDDDDAEEEFEEEEDDDEEEDEDDEEEFEFFGGFMFHGFFFGFFFLGHGGHGGNHHGGGOHGHGHMGGFGMFFFMFGFLFFFMGFFMGGMGGGNGGMGGLGGLGGMGGLEIEEHDCGCGCDGDGDCGDFCECCECECECECFCECFCFCFCFCFCGCJGYCYAAAAAAYAAAYAAUAATAAUAUAAUARARAQAPAPASARRAPARQAPAQQAQQAQSAKMATKKAIIHAIHGAGGGGAGHHGGAGGFGFFAFFGEFFFFFAFFGFGGGFFFEEFGFFGGFGGHIJJLKLWLKJJIJJJKJRLJKLKKKUKLLKKUMMKJIQIIIISKJJWKLLXMLMYMLNYMMYMLLWJIQIINFGKFFKEEIDHEDHDDFCECCFDECCFCFDGCDGCGCGEGCDCECECFDFCGDGCIEKEOAYNFBREUXKPQMMQTKT MMNJLPPVYYYTOUOPOLLJKKJJJIJIMJJJLIJJLLJIIHHIHHHIGHIHIHJHHHJHHIHGHGHFGHGFFEFEEEFEFEFFGGHIHIHGHGHHIIIIHIIJMNLONKLKKKKKKKMLKKLONMKOOOMLOPONMNMKKLLKKLMNKLMMMNMOPPOORPORSSVRTSSRTRRTSSTTXSTQRPONOKKLKLJMKJJIJIIHHHIIIJHIJIJJIJIKJIMWMYYDAAAAAAAAAAA

AAAAAAAAAAABAABAACAACACAACADADAEADADADADDAEAEEAEAFEAEEAEFAFGAGGGAGGGAHHHAHIIIAIHIJHAIIHIHHAJIHIJIJKJAJJJIKJJJJKKJKJKKLKLKLLMMMNNMYOOOOOOPOONYOONONNPYNOOOPYOOPPPYNONNYMLLWLLKUJIISHIHOGGMFGFLFFMGGLFGLGFLFFKFKFFLEEKFLEFJFKFGNGNHLFHJFIEGDIEKGOIRFGBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABABACACDACADDACADDADDAEDAFEAEEFAFFFAFFGAFGGGAIGHHIAHIHHHHAHIHIIJJIIAIIIIJIJKJIIIIJJHIIHIIIIJIIIIRJJJJKJJJJLVKLLKLLKXLMMKMXMLLLMWMMMMYMNLYMNNYNNMYMMNYMLYLMLXKJRIHPHIMGGMFEJEJEEIEEHDGCDFCFDCFCECECCEBEBECFDGCFDNGLDBAAAAAAAAAAAAAAAAAAAAAABAAAAABAABABAACACACACACACACADDADAEEAFAFGAFGAHGAGGAGGHAGGIAIHJAJJJJAJKKKKAMLMNNNANOMMNNMMNAONMNOOOMOOPOMNOMMNPOOPPPP RQQYPPRPPPPPNOYLLMMMMLYLMLMLYLMLMMYLNNMYNLLWMLKXLLLUKIKQIIQGHHPFHNGFLFFLGFJEEJEIDDIDCHDFCDGCFCCFCECECCECFCGDGDHDHDIFIDEBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAABBBBBCBCCCCDCCCCCCCCDDDDEDEEEEFFDEGGHGHHHGHHHHHHIIJJJJJIJJJJJJIKJJKLKKMMNMMMMMMMNNNNNNLNNONPONNNOOOOPQQQRSSSSSSUTSTUUUVWVVXUYXWVXVXWYVYWYVYYUWVUTTSSPQPQOPOPONONOMONOOONNNMMNLJJKJIIJHHGGGFHFGFFFFEEEDDEEEEFGGIGJLRNEAAAAAAAAAAAAA

任何幫助或提示,將不勝感激。

+0

您可能能夠使用HEX函數一次處理四個字節的字符串。 –

回答

0

你將能夠使用一個排序數組的唯一方法是使用臨時表和遊標/結果集。

問題是你仍然需要迭代字符串並使用子字符串將它們分開。據我所知,沒有'wordwrap'或'explode'功能來切斷字符串。