2016-05-03

Filter rows based on a nested condition string

I have a CSV file of data rows, with the header stored in the first line. For example:

first line -> [a,b,c,d,e] 

second line -> [0,1,2,1,2] 

third line -> [4,2,4,1,5] 

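In addition, I have a condition string that refers to the data, in the following format:

condition = (((a = d) OR (a = c)) AND (c < e))

The output should be just the third line. How can I evaluate this condition and separate out all the nested sub-conditions? I was thinking of a recursive function that reads through the parentheses, but my code has turned into a mess :(. Thanks for your answers, and sorry for my bad English!

PS: I don't want to use pandas or the csv library.

PS2: The condition above is only an example. There may be other, more deeply nested conditions, such as ((((a = d) AND (c > e)) OR (b = c)) AND (e < d)), or sometimes simply (a = d).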
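
As an illustration of the recursive, parenthesis-driven idea mentioned above, here is a minimal sketch in Python 3 that uses neither pandas nor the csv library. It assumes integer fields, column names that are plain identifiers, and the operators =, !=, <, <=, >, >= combined with AND and OR. The names parse, evaluate and filter_lines are hypothetical, chosen only for this example.

```python
import operator
import re

# Comparison operators accepted by this sketch ("=" means equality, as in the question).
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

# Tokens: integers, identifiers (including AND/OR), operators, parentheses.
# Characters that match none of these are silently skipped in this sketch.
TOKEN = re.compile(r"\d+|[A-Za-z_]\w*|<=|>=|!=|[<>=()]")


def parse(text):
    """Turn a condition string into a nested tuple tree."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of condition")
        pos += 1
        return tok

    def parse_or():
        node = parse_and()
        while peek() is not None and peek().upper() == "OR":
            take()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        node = parse_atom()
        while peek() is not None and peek().upper() == "AND":
            take()
            node = ("and", node, parse_atom())
        return node

    def parse_atom():
        if peek() == "(":
            take()
            node = parse_or()  # recurse into the parenthesised sub-condition
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        left, op, right = take(), take(), take()
        if op not in OPS:
            raise ValueError("unknown operator: %r" % op)
        return (op, left, right)

    tree = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token: %r" % peek())
    return tree


def evaluate(node, row):
    """Evaluate a tree from parse() against a dict of column name -> value."""
    kind, left, right = node
    if kind == "and":
        return evaluate(left, row) and evaluate(right, row)
    if kind == "or":
        return evaluate(left, row) or evaluate(right, row)

    def value(tok):
        # Integer literals are used as-is; anything else is a column name.
        return int(tok) if tok.isdigit() else row[tok]

    return OPS[kind](value(left), value(right))


def filter_lines(lines, condition):
    """Return the data lines (header excluded) that satisfy the condition."""
    header = [name.strip() for name in lines[0].split(",")]
    tree = parse(condition)
    kept = []
    for line in lines[1:]:
        values = [int(v) for v in line.split(",")]
        if evaluate(tree, dict(zip(header, values))):
            kept.append(line.strip())
    return kept
```

Each nested sub-condition ends up as its own tuple in the tree, so the sub-conditions can be inspected separately before evaluation. With the three sample lines, filter_lines(lines, "(((a = d) OR (a = c)) AND (c < e))") keeps only the third line.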

Answers

Here is a quick solution:

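# Python 2 (ifilter, tuple-unpacking lambdas and the print statement do not exist in Python 3)
from itertools import ifilter

with open('input.csv', 'r') as fi:
    # Skip the header line; pair each raw line with its fields parsed as integers
    lines = ((rawline, map(int, rawline.split(','))) for rawline in fi.readlines()[1:])
    # Hard-coded form of (((a = d) OR (a = c)) AND (c < e)), with a..e at indices 0..4
    results = ifilter(lambda (_, fds): (fds[0] == fds[3] or fds[0] == fds[2]) and (fds[2] < fds[4]), lines)
    for (rawline, _) in results:
        # Strip the trailing newline so that no extra blank line is printed
        print rawline.rstrip('\n')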

With input.csv containing:

a,b,c,d,e 
0,1,2,1,2 
4,2,4,1,5 

The output is:

4,2,4,1,5 

Update: a shorter, more compact implementation:

# Python 2
from itertools import ifilter

with open('input.csv', 'r') as fi:
    # Filter the parsed integer fields directly instead of carrying the raw line along
    results = ifilter(
        lambda fds: (fds[0] == fds[3] or fds[0] == fds[2]) and (fds[2] < fds[4]),
        (map(int, rawline.split(',')) for rawline in fi.readlines()[1:]))
    for fields in results:
        print ','.join(map(str, fields))
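
Nice! But I'm looking for something general; I forgot to mention that the condition is just one example among many possibilities. – Zealot

@Zealot, please check my other answer –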

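Since the requirements have been updated, it is better to post a new answer than to edit the old one.

The quickest way to evaluate a condition given as a string is to use the built-in function eval. That way you don't have to do the heavy, expensive parsing yourself (lexical analysis, syntactic analysis).

Here is the sample code: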

# Python 2
from itertools import ifilter

condition1 = '(((a = d) OR (a = c)) AND (c < e))'

def evalCondition(condition, *args):
    '''
    1) If the condition already follows Python grammar, the replacement below is not needed.
    2) This assumes the condition contains no '>=' or '<='; otherwise a more sophisticated
       replacement is required, e.g. one based on a regular expression.
    '''
    # Translate the condition into a Python expression
    condition = condition.replace('=', '==').replace('OR', 'or').replace('AND', 'and')

    # Bind the column values to local names so that eval can resolve them
    a,b,c,d,e = args
    return eval(condition)

with open('input.csv', 'r') as fi:
    results = ifilter(
        lambda fields: evalCondition(condition1, *fields),
        (map(int, rawline.split(',')) for rawline in fi.readlines()[1:]))
    for fields in results:
        print ','.join(map(str, fields))

With input.csv containing:

a,b,c,d,e 
0,1,2,1,2 
4,2,4,1,5 

The output is:

4,2,4,1,5
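
Two limitations of the eval approach above can be relaxed: the plain string replacement breaks on '>=' and '<=', and the column names a to e are hard-coded. The following is a minimal sketch in Python 3 (not the Python 2 used in the answer), assuming integer fields and uppercase AND/OR keywords. The names to_python and filter_with_eval are hypothetical. Note that eval executes whatever expression it is given, so this is only suitable for condition strings you trust.

```python
import re

# Two-character operators are listed first so that ">=", "<=", "!=" and "=="
# are matched whole and left intact; only a lone "=" becomes "==".
_TOKEN = re.compile(r"<=|>=|!=|==|=|\b(?:AND|OR)\b")


def to_python(condition):
    """Translate the question's condition syntax into a Python expression."""
    def repl(match):
        tok = match.group(0)
        if tok == "=":
            return "=="
        if tok in ("AND", "OR"):
            return tok.lower()
        return tok  # already valid Python
    return _TOKEN.sub(repl, condition)


def filter_with_eval(lines, condition):
    """Return the data lines (header excluded) that satisfy the condition."""
    # Column names come from the header line instead of being hard-coded.
    header = [name.strip() for name in lines[0].split(",")]
    # Compile once, evaluate once per row.
    code = compile(to_python(condition), "<condition>", "eval")
    kept = []
    for line in lines[1:]:
        values = [int(v) for v in line.split(",")]
        row = dict(zip(header, values))
        # An empty __builtins__ narrows what the expression can reach,
        # but it does not make eval safe for untrusted input.
        if eval(code, {"__builtins__": {}}, row):
            kept.append(line.strip())
    return kept
```

Here to_python("(((a = d) OR (a = c)) AND (c < e))") gives "(((a == d) or (a == c)) and (c < e))", and for the sample data filter_with_eval keeps only the line 4,2,4,1,5.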