我敢肯定,我的if/elif的語句可以改進,但是這是第 件事我能想到的。如果需要,我可以稍後回來。
@ or1426的解決方案會創建一個當前最長的已排序序列的列表,並在發現較長序列時將其複製到longest
。這會在每次找到更長的序列時創建一個新列表,並附加到每個字符的列表中。這在Python中實際上非常快,但請參見下文。
@ Deej的解決方案將當前最長的已排序序列保留在字符串變量中,並且每當找到更長的子字符串(即使它是當前序列的延續)時,子字符串都會保存到列表中。該列表最後有全部排序的原始字符串的子字符串,最長的是通過調用max
找到的。
這裏是一個更快的解決方案,只跟蹤當前最大序列的指數,當它發現一個字符不按排序順序只是更改最長:
def bjorn4(s):
# we start out with s[0] being the longest sorted substring (LSS)
longest = (0, 1) # the slice-indices of the longest sorted substring
longlen = 1 # the length of longest
cur_start = 0 # the slice-indices of the *current* LSS
cur_stop = 1
for ch in s[1:]: # skip the first ch since we handled it above
end = cur_stop-1 # cur_stop is a slice index, subtract one to get the last ch in the LSS
if ch >= s[end]: # if ch >= then we're still in sorted order..
cur_stop += 1 # just extend the current LSS by one
else:
# we found a ch that is not in sorted order
if longlen < (cur_stop-cur_start):
# if the current LSS is longer than longest, then..
longest = (cur_start, cur_stop) # store current in longest
longlen = longest[1] - longest[0] # precompute longlen
# since we can't add ch to the current LSS we must create a new current around ch
cur_start, cur_stop = cur_stop, cur_stop+1
# if the LSS is at the end, then we'll not enter the else part above, so
# check for it after the for loop
if longlen < (cur_stop - cur_start):
longest = (cur_start, cur_stop)
return s[longest[0]:longest[1]]
有多快?它幾乎是orl1426的兩倍,比deej快三倍。一如既往,取決於您的輸入。存在的排序子串的塊越多,上述算法與其他算法相比就越快。例如。上長度含有交替的100個隨機字符和100按順序字符100000的輸入字符串,我得到:
bjorn4: 2.4350001812
or1426: 3.84699988365
deej : 7.13800001144
如果我將其更改爲交替的1000個隨機字符和1000排序字符,然後我得到:
bjorn4: 23.129999876
or1426: 38.8380000591
deej : MemoryError
更新: 這裏是我的算法的進一步優化版本,與比較代碼:
import random, string
from itertools import izip_longest
import timeit
def _randstr(n):
ls = []
for i in range(n):
ls.append(random.choice(string.lowercase))
return ''.join(ls)
def _sortstr(n):
return ''.join(sorted(_randstr(n)))
def badstr(nish):
res = ""
for i in range(nish):
res += _sortstr(i)
if len(res) >= nish:
break
return res
def achampion(s):
start = end = longest = 0
best = ""
for c1, c2 in izip_longest(s, s[1:]):
end += 1
if c2 and c1 <= c2:
continue
if (end-start) > longest:
longest = end - start
best = s[start:end]
start = end
return best
def bjorn(s):
cur_start = 0
cur_stop = 1
long_start = cur_start
long_end = cur_stop
for ch in s[1:]:
if ch < s[cur_stop-1]:
if (long_end-long_start) < (cur_stop-cur_start):
long_start = cur_start
long_end = cur_stop
cur_start = cur_stop
cur_stop += 1
if (long_end-long_start) < (cur_stop-cur_start):
return s[cur_start:cur_stop]
return s[long_start:long_end]
def or1426(s):
longest = [s[0]]
current = [s[0]]
for char in s[1:]:
if char >= current[-1]: # current[-1] == current[len(current)-1]
current.append(char)
else:
current=[char]
if len(longest) < len(current):
longest = current
return ''.join(longest)
if __name__ == "__main__":
print 'achampion:', round(min(timeit.Timer(
"achampion(rstr)",
setup="gc.enable();from __main__ import achampion, badstr; rstr=badstr(30000)"
).repeat(15, 50)), 3)
print 'bjorn:', round(min(timeit.Timer(
"bjorn(rstr)",
setup="gc.enable();from __main__ import bjorn, badstr; rstr=badstr(30000)"
).repeat(15, 50)), 3)
print 'or1426:', round(min(timeit.Timer(
"or1426(rstr)",
setup="gc.enable();from __main__ import or1426, badstr; rstr=badstr(30000)"
).repeat(15, 50)), 3)
隨着輸出:
achampion: 0.274
bjorn: 0.253
or1426: 0.486
改變數據是隨機的:
achampion: 0.350
bjorn: 0.337
or1426: 0.565
和分類:
achampion: 0.262
bjorn: 0.245
or1426: 0.503
「不,不,它不是死的,它是靜止狀態」
嘗試'因爲我在枚舉:'應該工作在我的頭頂 – dgBP
「爲什麼會發生?」提示:如果'[i]'是字符串中的最後一個字符,那麼'[i + 1]'是什麼? – Kevin
這是因爲'for s in s'將從字面上將's'的元素作爲一個數組,所以第一個元素將是'c',第二個元素將是'n'等。您想要查找索引,所以你想得到你所在的元素的編號,'enumerate'應該這樣做。如果有效,我會將其作爲答案發布。 – dgBP