給定字符串S中的最小字符數應該被刪除以使其成爲排序字符串

我需要找到使字符串排序所需的最小刪除次數。給定字符串S中的最小字符數應該被刪除以使其成爲排序字符串

樣品測試情況：

# Given Input: 
teststr = "abcb" 
# Expected output: 
1 

# Explanation 
# In this test case, if I delete last 'b' from "abcb", 
# then the remaining string "abc" is sorted. 
# That is, a single deletion is required. 

# Given Input: 
teststr = "vwzyx" 
# Expected output: 
2 

# Explanation 
# Here, if I delete 'z' and 'x' from "vwzyx", 
# then the remaining string "vwy" is a sorted string.

我嘗試以下，但它提供了時間限制超出錯誤。任何其他解決此問題的方法？

string = input() 
    prev_ord = ord(string[0]) 
    deletion = 0 
    for char in string[1:]: 
     if ord(char) > prev_ord +1 or ord(char) < prev_ord: 
      deletion += 1 
      continue 
     prev_ord = ord(char) 
    print(deletion)

來源

2016-10-01 Kishan

有這個問題，這是被稱爲[最長遞增子（一個相當不錯的Python實現https://en.wikipedia.org/wiki/Longest_increasing_subsequence ）問題了（這裏）[http://stackoverflow.com/questions/3992697/longest-increasing-subsequence] – chthonicdaemon

感謝您找到這個愚蠢的目標，@chthonicdaemon。 –

我已經爲我的答案添加了_much_更高效的版本。 –

您當前的算法將給出不正確的結果對於很多字符串。

我懷疑有一種更有效的方法來解決這個問題，但這裏是一個強力解決方案。它生成輸入字符串的子集，按長度排序，降序排列。子集中的元素保留原始字符串的順序。只要count_deletions找到一個有序的子集，它就會將其返回（轉換回字符串）以及刪除次數。因此，它發現的解決方案保證不會比任何其他排序的輸入字符串選擇短。

有關我使用的各種itertools函數的信息，請參閱itertools docs;用於生成子集的算法來自Recipes部分中的powerset示例。

from itertools import chain, combinations 

def count_deletions(s): 
    for t in chain.from_iterable(combinations(s, r) for r in range(len(s), 0, -1)): 
     t = list(t) 
     if t == sorted(t): 
      return ''.join(t), len(s) - len(t) 

# Some test data. 
data = [ 
    "abcdefg", 
    "cba", 
    "abcb", 
    "vwzyx", 
    "zvwzyx", 
    "adabcef", 
    "fantastic", 
] 

for s in data: 
    print(s, count_deletions(s))

輸出

abcdefg ('abcdefg', 0) 
cba ('c', 2) 
abcb ('abc', 1) 
vwzyx ('vwz', 2) 
zvwzyx ('vwz', 3) 
adabcef ('aabcef', 1) 
fantastic ('fntt', 5)

該數據集是不是真的足夠充分的測試算法設計來解決這個問題，但我想這是一個好的起點。 :)

更新

下面是一個Python 3實現由薩爾瓦多·達利的鏈接頁面上提到的算法。這是很多比我以前的蠻力方法更快，特別是對於較長的字符串。

我們可以通過排序字符串的副本，然後找到排序字符串原始字符串的最長公用子序列（LCS）&找到最長的已排序子序列。薩爾瓦多的版本從排序的字符串中刪除重複的元素，因爲他希望結果嚴格增加，但我們並不需要這裏。

此代碼只返回所需的刪除次數，但很容易修改它以返回實際已排序的字符串。

爲了使此遞歸函數更高效，它使用functools的lru_cache裝飾器。

from functools import lru_cache 

@lru_cache(maxsize=None) 
def lcs_len(x, y): 
    if not x or not y: 
     return 0 

    xhead, xtail = x[0], x[1:] 
    yhead, ytail = y[0], y[1:] 
    if xhead == yhead: 
     return 1 + lcs_len(xtail, ytail) 
    return max(lcs_len(x, ytail), lcs_len(xtail, y)) 

def count_deletions(s): 
    lcs_len.cache_clear() 
    return len(s) - lcs_len(s, ''.join(sorted(s))) 

data = [ 
    "abcdefg", 
    "cba", 
    "abcb", 
    "vwzyx", 
    "zvwzyx", 
    "adabcef", 
    "fantastic", 
] 

for s in data: 
    print(s, count_deletions(s))

輸出

abcdefg 0 
cba 2 
abcb 1 
vwzyx 2 
zvwzyx 3 
adabcef 1 
fantastic 5

來源

2016-10-01 14:42:40

希望它能適用於所有情況下:)

s = input() 
s_2 = ''.join(sorted(set(s), key=s.index)) 
sorted_string = sorted(s_2) 
str_to_list = list(s_2) 
dif = 0 

for i in range(len(sorted_string)): 
    if sorted_string[i]!=str_to_list[i]: 
     dif+=1 

print(dif+abs(len(s)-len(s_2)))

來源

2016-10-01 13:51:52 Tuc3k

不，這沒有用。例如，在'zvwzyx'上它返回5，但我們可以從該字符串中產生'vwz'，所以刪除計數只有3。'adabcef'上它應該返回1，而不是4. –

是的，你是對的 – Tuc3k

給定字符串S中的最小字符數應該被刪除以使其成爲排序字符串

回答

相關問題