2016-04-26

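Python str.index time complexity

To find the position of a substring within a string, a naive algorithm takes O(n^2) time in the worst case. With an efficient algorithm such as KMP, the same search can be done in O(n) time: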

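s = 'saurabh'
w = 'au'

def get_table():
    # t[j] is the position in w to resume from after a mismatch at w[j];
    # -1 means "no fallback, move on to the next character of s".
    t = [-1] * len(w)
    if len(w) > 1:
        t[1] = 0
    i = 2  # position in t being filled
    j = 0  # length of the current candidate prefix of w
    while i < len(w):
        if w[i - 1] == w[j]:
            j += 1
            t[i] = j
            i += 1
        elif j > 0:
            j = t[j]  # fall back to a shorter candidate prefix
        else:
            t[i] = 0
            i += 1
    return t

def get_word():
    if not w:
        return 0  # the empty pattern matches at position 0
    t = get_table()
    i = j = 0  # i: start of the current match in s, j: position in w
    while i + j < len(s):
        if w[j] == s[i + j]:
            if j == len(w) - 1:
                return i
            j += 1
        elif t[j] > -1:
            i = i + j - t[j]
            j = t[j]
        else:
            i += 1
            j = 0
    return -1

if __name__ == '__main__':
    print(get_word())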

But if we simply write 'saurabh'.index('ra'), does Python internally use an efficient algorithm that runs in O(n), or does it use the naive O(n^2) algorithm?
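For contrast with the KMP code above, here is a minimal sketch of the naive search the question refers to. The name `naive_find` and the comparison counter are illustrative assumptions, not part of the original post. The counter makes the worst case visible: with a pattern of length m and a text of length n, the naive method can perform about (n - m + 1) * m character comparisons.

```python
def naive_find(text, pattern):
    """Return (index, comparisons): the first match position (or -1)
    and the number of character comparisons performed."""
    n, m = len(text), len(pattern)
    comparisons = 0
    for i in range(n - m + 1):
        j = 0
        # Compare the pattern against the text at alignment i.
        while j < m:
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            return i, comparisons
    return -1, comparisons


if __name__ == "__main__":
    print(naive_find("saurabh", "ra"))
    # Adversarial input: every alignment matches 99 characters before failing.
    print(naive_find("a" * 1000, "a" * 99 + "b"))
```

On the adversarial input every one of the 901 alignments costs 100 comparisons, which is the quadratic behavior KMP avoids.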

You can profile it and see whether the time grows quadratically or linearly ;)

Answers

2

In this article [1] the author walks through the algorithm and explains it. From the article:

The function "fastsearch" is called. It is a mix between 
Boyer-Moore and Horspool algorithms plus couple of neat tricks. 

And from the Wikipedia page on the Boyer-Moore-Horspool algorithm [2]:

The algorithm trades space for time in order to obtain an 
average-case complexity of O(N) on random text, although 
it has O(MN) in the worst case, where the length of the 
pattern is M and the length of the search string is N. 
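To make the quoted trade-off concrete, here is a minimal textbook sketch of the Boyer-Moore-Horspool idea. It is not CPython's actual "fastsearch" routine, and the name `horspool_find` is a hypothetical helper chosen for illustration. The extra space is the shift table; the time saving comes from skipping up to M characters after a failed alignment.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    # Bad-character table: for every character of the pattern except the
    # last one, the distance from its last occurrence to the pattern's end.
    shift = {ch: m - 1 - k for k, ch in enumerate(pattern[:-1])}
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            return i
        # Slide by the shift of the text character under the pattern's last
        # position; characters absent from the table allow a full skip of m.
        i += shift.get(text[i + m - 1], m)
    return -1


if __name__ == "__main__":
    print(horspool_find("saurabh", "ra"))
```

On random text most alignments fail quickly and skip several characters, which gives the average-case O(N) behavior. A text like "aaaa...a" with the pattern "baaa...a" forces a shift of one after an almost complete comparison each time, which is the O(MN) worst case mentioned in the quote.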

Hope that helps!

[1] http://www.laurentluce.com/posts/python-string-objects-implementation

[2] https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm

But KMP's worst-case time is still linear. Does that mean we should implement KMP ourselves instead of using Python's built-in index() in time-critical code?

I think this thread has a good answer on that topic: http://programmers.stackexchange.com/questions/183725/which-string-search-algorithm-is-actually-the-fastest – alpert

1

Sometimes you can get a quick answer just by trying it:

>>> import timeit
>>> timeit.timeit('x.index("ra")', setup='x="a"*100+"ra"') 
0.4658635418727499 
>>> timeit.timeit('x.index("ra")', setup='x="a"*200+"ra"') 
0.7199222409243475 
>>> timeit.timeit('x.index("ra")', setup='x="a"*300+"ra"') 
0.9555441829046458 
>>> timeit.timeit('x.index("ra")', setup='x="a"*400+"ra"') 
1.1994099491303132 
>>> timeit.timeit('x.index("ra")', setup='x="a"*500+"ra"') 
1.4850994427915793
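The measurements above can be reproduced as a script. The sketch below assumes a hypothetical helper `time_index` and an arbitrary repetition count; absolute numbers will differ by machine and Python version, so the ratio to the smallest input is the more useful figure.

```python
import timeit


def time_index(prefix_len, needle="ra", number=100_000):
    """Seconds taken by `number` calls of str.index on 'a' * prefix_len + needle."""
    haystack = "a" * prefix_len + needle
    return timeit.timeit(lambda: haystack.index(needle), number=number)


if __name__ == "__main__":
    base = time_index(100)
    for size in (100, 200, 300, 400, 500):
        elapsed = time_index(size)
        # A ratio that tracks size / 100 suggests linear growth on this input.
        print(f"{size:>4} chars: {elapsed:.4f}s  (x{elapsed / base:.2f})")
```

Note that this only probes one family of inputs. Roughly linear growth here is consistent with the average-case claim, but it says nothing about worst-case inputs.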