2011-01-22 37 views
0

Python: what is the best way to get a denormalized array from this ordered array?

I have this array:

>>> print raw_data 
['LEVEL 1', 
'SUBJECT A', 
'GROUP X', 
'COMMENT i', 
'COMMENT ii', 
'COMMENT iii', 
'GROUP Y', 
'COMMENT iv', 
'COMMENT v', 
'COMMENT vi', 
'LEVEL 2', 
'SUBJECT B', 
'GROUP Z', 
'COMMENT vii', 
'COMMENT viii', 
'COMMENT ix', 
'SUBJECT C', 
'GROUP X2', 
'COMMENT x', 
'COMMENT xi', 
'COMMENT xii', 
'COMMENT xiii', 
'GROUP Y2', 
'COMMENT xiv', 
'COMMENT xv', 
'COMMENT xvi'] 

The obvious hierarchy is:

  1. Level
    1. Subject
      1. Group
        1. Comment

My goal is to turn the array into a denormalized array that can be stored in a database:

>>> print result 
[ 
    ['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT i'], 
    ['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT ii'], 
    ['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT iii'], 
    ['LEVEL 1', 'SUBJECT A', 'GROUP Y', 'COMMENT iv'], 
    ['LEVEL 1', 'SUBJECT A', 'GROUP Y', 'COMMENT v'], 
    ['LEVEL 1', 'SUBJECT A', 'GROUP Y', 'COMMENT vi'], 
    ['LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT vii'], 
    ['LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT viii'], 
    ['LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT ix'], 
    ['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT x'], 
    ['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT xi'], 
    ['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT xii'], 
    ['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT xiii'], 
    ['LEVEL 2', 'SUBJECT C', 'GROUP Y2', 'COMMENT xiv'], 
    ['LEVEL 2', 'SUBJECT C', 'GROUP Y2', 'COMMENT xv'], 
    ['LEVEL 2', 'SUBJECT C', 'GROUP Y2', 'COMMENT xvi'] 
] 

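I have tried to solve this, but I am completely lost. I assume this is a common problem, so I wonder whether anyone has an efficient approach. It looks a bit like nested sets, but I don't know much about doing that in Python. Getting the levels is easy, but I get a "headache" trying to take it any further.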

>>> def addlevel(a): 
...     if a.startswith('LEVEL'): 
...         return [1, a] 
...     elif a.startswith('SUBJECT'): 
...         return [2, a] 
...     elif a.startswith('GROUP'): 
...         return [3, a] 
...     elif a.startswith('COMMENT'): 
...         return [4, a] 
... 
>>> map(addlevel, raw_data) 
[[1, 'LEVEL 1'], 
[2, 'SUBJECT A'], 
[3, 'GROUP X'], 
[4, 'COMMENT i'], 
[4, 'COMMENT ii'], 
[4, 'COMMENT iii'], 
[3, 'GROUP Y'], 
[4, 'COMMENT iv'], 
[4, 'COMMENT v'], 
[4, 'COMMENT vi'], 
[1, 'LEVEL 2'], 
[2, 'SUBJECT B'], 
[3, 'GROUP Z'], 
[4, 'COMMENT vii'], 
[4, 'COMMENT viii'], 
[4, 'COMMENT ix'], 
[2, 'SUBJECT C'], 
[3, 'GROUP X2'], 
[4, 'COMMENT x'], 
[4, 'COMMENT xi'], 
[4, 'COMMENT xii'], 
[4, 'COMMENT xiii'], 
[3, 'GROUP Y2'], 
[4, 'COMMENT xiv'], 
[4, 'COMMENT xv'], 
[4, 'COMMENT xvi']] 

I would appreciate any clues!

Answers

3

You could try something like this:

import pprint 

raw_data = [ 'LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT i', 'COMMENT ii', 
'COMMENT iii', 'GROUP Y', 'COMMENT iv', 'COMMENT v', 'COMMENT vi', 'LEVEL 2', 
'SUBJECT B', 'GROUP Z', 'COMMENT vii', 'COMMENT viii', 'COMMENT ix', 
'SUBJECT C', 'GROUP X2', 'COMMENT x', 'COMMENT xi', 'COMMENT xii', 
'COMMENT xiii', 'GROUP Y2', 'COMMENT xiv', 'COMMENT xv', 'COMMENT xvi' ] 

# Most recently seen item at each depth of the hierarchy. 
level, subject, group, comment = '', '', '', '' 

result = [] 

for item in raw_data: 

    # A new parent invalidates everything beneath it, so clear all sublevels. 
    if item.startswith('COMMENT'): 
        comment = item 
    elif item.startswith('GROUP'): 
        group = item 
        comment = '' 
    elif item.startswith('SUBJECT'): 
        subject = item 
        group = comment = '' 
    elif item.startswith('LEVEL'): 
        level = item 
        subject = group = comment = '' 

    # Emit a row only when every depth is filled in, i.e. on each COMMENT. 
    if level and subject and group and comment: 
        result.append([level, subject, group, comment]) 

pprint.pprint(result) 

This produces:

[['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT i'], 
['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT ii'], 
['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT iii'], 
['LEVEL 1', 'SUBJECT A', 'GROUP Y', 'COMMENT iv'], 
['LEVEL 1', 'SUBJECT A', 'GROUP Y', 'COMMENT v'], 
['LEVEL 1', 'SUBJECT A', 'GROUP Y', 'COMMENT vi'], 
['LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT vii'], 
['LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT viii'], 
['LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT ix'], 
['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT x'], 
['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT xi'], 
['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT xii'], 
['LEVEL 2', 'SUBJECT C', 'GROUP X2', 'COMMENT xiii'], 
['LEVEL 2', 'SUBJECT C', 'GROUP Y2', 'COMMENT xiv'], 
['LEVEL 2', 'SUBJECT C', 'GROUP Y2', 'COMMENT xv'], 
['LEVEL 2', 'SUBJECT C', 'GROUP Y2', 'COMMENT xvi']] 
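
Since the question mentions storing the rows in a database, here is a minimal sketch of how rows in this shape could be inserted with the standard-library `sqlite3` module. The table name `comments` and the column names are assumptions made up for the illustration, and only three sample rows are used; the asker's actual schema and database are not known.

```python
import sqlite3

# A few rows in the shape produced above (shortened for the example).
rows = [
    ['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT i'],
    ['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT ii'],
    ['LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT vii'],
]

# In-memory database, for illustration only.
conn = sqlite3.connect(':memory:')

# Hypothetical table and column names; one column per depth of the hierarchy.
conn.execute(
    'CREATE TABLE comments (lvl TEXT, subject TEXT, grp TEXT, comment_text TEXT)'
)

# executemany runs the INSERT once per row, binding the four values in order.
conn.executemany('INSERT INTO comments VALUES (?, ?, ?, ?)', rows)
conn.commit()

# Count the rows stored under one level to check the insert worked.
stored = conn.execute(
    'SELECT COUNT(*) FROM comments WHERE lvl = ?', ('LEVEL 1',)
).fetchone()[0]
print(stored)

conn.close()
```

Because every row repeats its level, subject and group, each one can be inserted on its own without any lookups, which is the point of denormalizing first.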
+0

Simple! I was trying to do something with nested sets, but this looks much simpler –

+0

Wrong! `['LEVEL 2', 'SUBJECT A', 'GROUP Y', 'COMMENT vi'],` should not be in the result. For each level found, you need to clear all of its sublevels. – PaulMcG

+0

@Paul Thanks for spotting the error (and for pyparsing, btw ;) - corrected my post. – miku

5

Pseudocode, since I don't have a Python interpreter handy right now:

Set LEVEL, SUBJECT, GROUP to None, results to [] 

Loop over the list 
    if it's a 'LEVEL', set LEVEL to it 
    if it's a 'SUBJECT', set SUBJECT to it 
    if it's a 'GROUP', set GROUP to it 
    if it's a 'COMMENT', append [LEVEL, SUBJECT, GROUP, COMMENT] to results 
Ta-da. 

It simply relies on the ordering...
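
One possible Python rendering of the pseudocode above, as a sketch rather than the answerer's own code. The function name `denormalize` and the `HIERARCHY` tuple are made up for this example. It also clears every deeper slot whenever a parent changes and skips incomplete rows, which goes slightly beyond the pseudocode; it still assumes the input is ordered parent-first.

```python
# Hypothetical names; the prefix order encodes the hierarchy from the question.
HIERARCHY = ('LEVEL', 'SUBJECT', 'GROUP', 'COMMENT')


def denormalize(items, hierarchy=HIERARCHY):
    """Return one [level, subject, group, comment] row per COMMENT item."""
    current = [None] * len(hierarchy)  # most recent item seen at each depth
    results = []
    for item in items:
        # Find which depth of the hierarchy this item belongs to.
        for depth, prefix in enumerate(hierarchy):
            if item.startswith(prefix):
                break
        else:
            raise ValueError('unrecognised item: %r' % item)

        current[depth] = item
        # A new parent invalidates everything beneath it.
        for deeper in range(depth + 1, len(hierarchy)):
            current[deeper] = None

        # Only the deepest kind produces a row, and only if all parents exist.
        if depth == len(hierarchy) - 1 and None not in current:
            results.append(list(current))
    return results


sample = ['LEVEL 1', 'SUBJECT A', 'GROUP X', 'COMMENT i', 'COMMENT ii',
          'GROUP Y', 'COMMENT iii',
          'LEVEL 2', 'SUBJECT B', 'GROUP Z', 'COMMENT iv']
rows = denormalize(sample)
for row in rows:
    print(row)
```

Keeping the state in a list indexed by depth means a fifth tier could be added by extending `HIERARCHY`, without touching the loop.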

+2

There should be a "Ta-da" command in Python. :) Anyway, nice snippet. – Donovan
