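This problem got me interested, so I wrote an overly generic solution.
Here's a function that
- aligns any number of sorted iterables
- works on iterators, so it can efficiently handle long (or infinite) sequences
- supports repeated values
- is compatible with Python 2 and 3 (though if I didn't care about historical Python versions, I'd use `align_iterables(*inputs, missing_value=None)`)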
import itertools

def align_iterables(inputs, missing=None):
    """Align sorted iterables

    Yields tuples with values from the respective `inputs`, placing
    `missing` if the value does not exist in the corresponding
    iterable.

    Example: align_iterables(['bc', 'bf', '', 'abf']) yields:
        (None, None, None, 'a')
        ('b', 'b', None, 'b')
        ('c', None, None, None)
        (None, 'f', None, 'f')
    """
    End = object()  # unique sentinel marking an exhausted iterator
    iterators = [itertools.chain(i, [End]) for i in inputs]
    values = [next(i) for i in iterators]
    while not all(v is End for v in values):
        smallest = min(v for v in values if v is not End)
        yield tuple(v if v == smallest else missing for v in values)
        # Advance only the iterators whose current value was just emitted
        values = [next(i) if v == smallest else v
                  for i, v in zip(iterators, values)]
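The Python 3-only signature mentioned above could be sketched as follows. The name `align_iterables_py3` is hypothetical, chosen here only to keep it distinct from the function above; the body is the same algorithm with the inputs passed positionally and `missing_value` made keyword-only.

```python
import itertools


def align_iterables_py3(*inputs, missing_value=None):
    """Python 3-only sketch: positional inputs, keyword-only missing_value."""
    end = object()  # unique sentinel marking an exhausted iterator
    iterators = [itertools.chain(i, [end]) for i in inputs]
    values = [next(i) for i in iterators]
    while not all(v is end for v in values):
        smallest = min(v for v in values if v is not end)
        yield tuple(v if v == smallest else missing_value for v in values)
        # Advance only the iterators whose current value was just emitted.
        values = [next(i) if v == smallest else v
                  for i, v in zip(iterators, values)]


rows = list(align_iterables_py3('bc', 'bf', '', 'abf'))
for row in rows:
    print(row)
```

Because `missing_value` comes after `*inputs`, it can only be passed by keyword, so it can never be mistaken for one more iterable to align.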
An adapter for the question's problem:
def align_two_lists(list1, list2, missing="MISSING"):
    value = list(zip(*list(align_iterables([list1, list2], missing=missing))))
    if not value:
        return [[], []]
    else:
        a, b = value
        return [list(a), list(b)]
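The adapter relies on `zip(*rows)` to turn the yielded rows into one column per input list. A small sketch of that step, using hand-written rows (what the generator would yield for `'ac'` and `'abc'` with `'_'` as the missing value), also shows why the empty case probably needs its own branch:

```python
# Rows as the generator would yield them for 'ac' and 'abc' with missing='_'.
rows = [('a', 'a'), ('_', 'b'), ('c', 'c')]

# zip(*rows) transposes the rows into one column per input.
columns = [list(column) for column in zip(*rows)]
print(columns)  # [['a', '_', 'c'], ['a', 'b', 'c']]

# With no rows there is nothing to transpose: the result has zero columns
# rather than two empty ones, so unpacking it into `a, b` would fail.
empty = list(zip(*[]))
print(empty)  # []
```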
And a set of tests for the question's problem:
if __name__ == '__main__':
    assert align_two_lists('abcef', 'abcdef', '_') == [['a', 'b', 'c', '_', 'e', 'f'], ['a', 'b', 'c', 'd', 'e', 'f']]
    assert align_two_lists('a', 'abcdef', '_') == [['a', '_', '_', '_', '_', '_'], ['a', 'b', 'c', 'd', 'e', 'f']]
    assert align_two_lists('abcdef', 'a', '_') == [['a', 'b', 'c', 'd', 'e', 'f'], ['a', '_', '_', '_', '_', '_']]
    assert align_two_lists('', 'abcdef', '_') == [['_', '_', '_', '_', '_', '_'], ['a', 'b', 'c', 'd', 'e', 'f']]
    assert align_two_lists('abcdef', '', '_') == [['a', 'b', 'c', 'd', 'e', 'f'], ['_', '_', '_', '_', '_', '_']]
    assert align_two_lists('ace', 'abcdef', '_') == [['a', '_', 'c', '_', 'e', '_'], ['a', 'b', 'c', 'd', 'e', 'f']]
    assert align_two_lists('bdf', 'ace', '_') == [['_', 'b', '_', 'd', '_', 'f'], ['a', '_', 'c', '_', 'e', '_']]
    assert align_two_lists('ace', 'bdf', '_') == [['a', '_', 'c', '_', 'e', '_'], ['_', 'b', '_', 'd', '_', 'f']]
    assert align_two_lists('aaacd', 'acd', '_') == [['a', 'a', 'a', 'c', 'd'], ['a', '_', '_', 'c', 'd']]
    assert align_two_lists('acd', 'aaacd', '_') == [['a', '_', '_', 'c', 'd'], ['a', 'a', 'a', 'c', 'd']]
    assert align_two_lists('', '', '_') == [[], []]

    list1 = ["datetimeA", "datetimeB", "datetimeD", "datetimeE"]
    list2 = ["datetimeB", "datetimeC", "datetimeD", "datetimeF"]
    new_list1 = ["datetimeA", "datetimeB", "MISSING", "datetimeD", "datetimeE", "MISSING"]
    new_list2 = ["MISSING", "datetimeB", "datetimeC", "datetimeD", "MISSING", "datetimeF"]
    assert align_two_lists(list1, list2) == [new_list1, new_list2]
And some extra tests:
# Also test multiple generators
for expected, got in zip(
        [(None, None, None, 'a'),
         ('b', 'b', None, 'b'),
         ('c', None, None, None),
         (None, 'f', None, 'f')],
        align_iterables(['bc', 'bf', '', 'abf'])):
    assert expected == got

assert list(align_iterables([])) == []

# And an infinite generator
for expected, got in zip(
        [(0, 0),
         ('X', 1),
         (2, 2),
         ('X', 3),
         (4, 4)],
        align_iterables([itertools.count(step=2), itertools.count()], missing='X')):
    assert expected == got
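For comparison, when both inputs are finite, hashable, and free of repeated values (as in the question), roughly the same result can be obtained from a sorted set union. This is a different technique from the generator above, not a replacement for it: it drops duplicates and materializes everything in memory. The function name `align_unique_sorted` and the 2013 dates are made up for illustration.

```python
from datetime import datetime


def align_unique_sorted(list1, list2, missing="MISSING"):
    """Align two finite lists without repeated values via a sorted union."""
    members1, members2 = set(list1), set(list2)
    keys = sorted(members1 | members2)  # every distinct value, in order
    return [
        [k if k in members1 else missing for k in keys],
        [k if k in members2 else missing for k in keys],
    ]


# Illustrative datetime objects; any mutually comparable values would do.
a, b, c, d = (datetime(2013, 1, day) for day in range(1, 5))
aligned = align_unique_sorted([a, b, d], [b, c])
print(aligned)
```

Real `datetime` objects work here for the same reason they work with the generator: they define an ordering, so `sorted` (or `min`) can decide which value comes next.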
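How do you determine the order of the final list when you don't know the relative order of the datetime objects? I.e., what forces the last two elements to be 'E' then 'F'? Why not 'F' then 'E'? – arshajii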
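Sorry, that may have been a bit confusing. I saw your deleted answer, which was what I wanted, but I voted after you deleted it. What I mean is that, for example, if you look within `list1`, the absolute order is: `{A -> A, B -> B, D -> C, E -> D}`. The combined (or relative) numbering between the two lists that I presented in the question is basically what you get after applying the solution. It is `(E, F)` rather than `(F, E)` because `datetimeE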
For completeness, see [pandas](http://pandas.pydata.org/pandas-docs/dev/merging.html). It was written precisely for this kind of data manipulation. –