2011-12-04

Evaluating strings against boolean truth functions

I have searched but am still clueless, so please bear with me.

I have strings in which each character corresponds to a particular feature matrix. Example:

'a' = [-vegetable, +fruit, +apple, -orange] 
'o' = [-vegetable, +fruit, -apple, +orange] 
't' = [+vegetable, -fruit, -apple, -orange] 

Note that this is just the notation I chose to represent the matrices.

What I want to do is take any number of such strings and evaluate them against some truth function. So evaluating the string 'aoaot' against:

[+fruit] => [+apple] 
equivalently: (not [+fruit]) or [+apple] 

should return the number of times this implication is false for the given string. Either something like this:

[True, False, True, False, True] 

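or an absolute count of the number of times it evaluates to False, e.g. 2 here. What is a sensible way to do this in Python? I have been looking at NLTK, but I am not sure.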

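You don't need NLTK unless you want to parse queries posed in natural language. Parsing boolean expressions, even with arbitrary nesting allowed, is fairly simple for most parsing techniques (and nearly trivial for some). – delnan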

I do. I am trying to write an Optimality Theory eval() function for phonological constraint ranking. – Pygmalion

If 't' stands for 'tomato', that is actually a fruit, at least botanically. http://oxforddictionaries.com/page/tomatofruitveg – PaulMcG

Answers

You can implement the necessary logic using the set type.

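# Map each character to the set of features it has (only the '+' features are stored)
m = {
    'a': {'fruit', 'apple'},
    'o': {'fruit', 'orange'},
    't': {'vegetable'},
}

# [+fruit] => [+apple]; for booleans, p <= q is equivalent to (not p) or q
pred = lambda f: ('fruit' in f) <= ('apple' in f)

# True/False list, one entry per character
print([pred(m[c]) for c in 'aoaot'])          # [True, False, True, False, True]

# Number of Falses
print(sum(not pred(m[c]) for c in 'aoaot'))   # 2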

What exactly is being compared here? – Pygmalion

The expression ('fruit' in f) <= ('apple' in f) returns True if f has no fruit or if it has an apple. –
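
As a quick check of why the comparison works: Python's bool is a subclass of int with False == 0 and True == 1, so p <= q is False only when p is True and q is False, which is exactly material implication. The sketch below compares it with the explicit (not p) or q form over all four combinations; the helper name `implies` is hypothetical and introduced only for illustration.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# Evaluate both formulations on every combination of truth values.
table = [(p, q, p <= q, implies(p, q))
         for p, q in product([False, True], repeat=2)]

for p, q, via_le, via_or in table:
    print(p, q, via_le, via_or)
```

The explicit form is arguably easier to read in shared code, while the `<=` form is more compact.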

If you want more flexibility, using your own syntax, here is a simple parser for the data definitions you have given:

data = """\ 
a = [-vegetable, +fruit, +apple, -orange, -citrus] 
o = [-vegetable, +fruit, -apple, +orange, +citrus] 
t = [+vegetable, -fruit]""" 

from pyparsing import Word, alphas, oneOf, Group, delimitedList 

# a basic token for a word of alpha characters plus underscores 
ident = Word(alphas + '_') 

# define a token for leading '+' or '-', with parse action to convert to bool value 
inclFlag = oneOf('+ -') 
inclFlag.setParseAction(lambda t: t[0] == '+') 

# define a feature as the combination of an inclFlag and a feature name 
feature = Group(inclFlag('has') + ident('feature')) 

# define a definition 
defn = ident('name') + '=' + '[' + delimitedList(feature)('features') + ']' 

# search through the input test data for defns, and print out the parsed data 
# by name, and the associated features 
defns = defn.searchString(data) 
for d in defns:
    print(d.dump())
    for f in d.features:
        print(f.dump(' '))
    print()

Prints:

['a', '=', '[', [False, 'vegetable'], [True, 'fruit'], [True, 'apple'], [False, 'orange'], [False, 'citrus'], ']'] 
- features: [[False, 'vegetable'], [True, 'fruit'], [True, 'apple'], [False, 'orange'], [False, 'citrus']] 
- name: a 
    [False, 'vegetable'] 
    - feature: vegetable 
    - has: False 
    [True, 'fruit'] 
    - feature: fruit 
    - has: True 
    [True, 'apple'] 
    - feature: apple 
    - has: True 
    [False, 'orange'] 
    - feature: orange 
    - has: False 
    [False, 'citrus'] 
    - feature: citrus 
    - has: False 

['o', '=', '[', [False, 'vegetable'], [True, 'fruit'], [False, 'apple'], [True, 'orange'], [True, 'citrus'], ']'] 
- features: [[False, 'vegetable'], [True, 'fruit'], [False, 'apple'], [True, 'orange'], [True, 'citrus']] 
- name: o 
    [False, 'vegetable'] 
    - feature: vegetable 
    - has: False 
    [True, 'fruit'] 
    - feature: fruit 
    - has: True 
    [False, 'apple'] 
    - feature: apple 
    - has: False 
    [True, 'orange'] 
    - feature: orange 
    - has: True 
    [True, 'citrus'] 
    - feature: citrus 
    - has: True 

['t', '=', '[', [True, 'vegetable'], [False, 'fruit'], ']'] 
- features: [[True, 'vegetable'], [False, 'fruit']] 
- name: t 
    [True, 'vegetable'] 
    - feature: vegetable 
    - has: True 
    [False, 'fruit'] 
    - feature: fruit 
    - has: False 

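Pyparsing handles a lot of the overhead for you, such as iterating over the input string, skipping irrelevant whitespace, and returning the parsed data with named attributes. Check out the boolean evaluator on the pyparsing wiki (SimpleBool.py), or the more complete boolean evaluator package booleano.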
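
To tie the parsed definitions back to the original question, here is a minimal sketch of the evaluation step. It assumes the parse results have already been copied into plain (has, feature) pairs like those in the printed output; the names `parsed`, `to_lookup`, `implication`, and `evaluate` are hypothetical and not part of pyparsing. It also assumes that a feature not listed for a symbol counts as False, which matters for 't', since it has no apple entry.

```python
# Hypothetical data mirroring the printed parse results: name -> [(has, feature), ...]
parsed = {
    'a': [(False, 'vegetable'), (True, 'fruit'), (True, 'apple'),
          (False, 'orange'), (False, 'citrus')],
    'o': [(False, 'vegetable'), (True, 'fruit'), (False, 'apple'),
          (True, 'orange'), (True, 'citrus')],
    't': [(True, 'vegetable'), (False, 'fruit')],
}

def to_lookup(pairs):
    """Turn [(has, feature), ...] into {feature: has}."""
    return {feature: has for has, feature in pairs}

features = {name: to_lookup(pairs) for name, pairs in parsed.items()}

def implication(antecedent, consequent):
    """Build a predicate for [+antecedent] => [+consequent].

    Assumption: a feature missing from a symbol's lookup counts as False.
    """
    def check(lookup):
        return (not lookup.get(antecedent, False)) or lookup.get(consequent, False)
    return check

def evaluate(string, constraint):
    """Apply the constraint to the feature lookup of each character."""
    return [constraint(features[ch]) for ch in string]

results = evaluate('aoaot', implication('fruit', 'apple'))
violations = results.count(False)
print(results)     # [True, False, True, False, True]
print(violations)  # 2
```

Keeping the constraint as a function that takes a feature lookup means several constraints can be evaluated over the same string without re-parsing the definitions.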