How do I construct a "LeadBy" equivalent of the FollowedBy subclass in PyParsing?

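I am trying to clean up some code by removing leading or trailing whitespace characters with PyParsing. Removing the leading whitespace was easy, because I could use the FollowedBy subclass, which matches a string but does not include it in the match. Now I need the same thing for whatever comes after my identifying string.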

Here is a small example:

from pyparsing import * 

insource = """ 
annotation (Documentation(info=" 
    <html> 
<b>FOO</b> 
</html> 
")); 
""" 
# Working replacement: 
HTMLStartref = OneOrMore(White(' \t\n')) + (FollowedBy(CaselessLiteral('<html>'))) 

## Not working because "LeadBy" does not exist
# HTMLEndref = LeadBy(CaselessLiteral('</html>')) + OneOrMore(White(' \t\n')) + FollowedBy('"')

out = Suppress(HTMLStartref).transformString(insource)
# Raises a NameError until HTMLEndref can actually be defined:
# out2 = Suppress(HTMLEndref).transformString(out)

As output I get:

>>> print(out)
annotation (Documentation(info="<html> 
<b>FOO</b> 
</html> 
"));

and what I should get is:

>>> print(out2)
annotation (Documentation(info="<html> 
<b>FOO</b> 
</html>"));

I looked at the documentation but could not find a "LeadBy" counterpart to FollowedBy, or any other way to achieve this.


2012-08-02 Dietmar Winkler

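What you are asking for is something like a "lookbehind", that is, a match that succeeds only when it is preceded by a particular pattern. I don't have an explicit class for that at the moment, but for what you want to do you can still transform left to right: leave the leading part in place instead of suppressing it, and suppress only the whitespace.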
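For readers on a more recent pyparsing release: a lookbehind class named `PrecededBy` was added later (as far as I know in the 2.3 series), so the hypothetical `LeadBy` from the question can be written almost literally. This is only a minimal sketch, assuming such a release is installed. The shortened `insource` and the name `trailing_ws` are illustrative and not part of the original post.

```python
from pyparsing import FollowedBy, Literal, PrecededBy, White

# Shortened copy of the question's input (illustrative; trailing spaces dropped).
insource = 'annotation (Documentation(info="\n    <html>\n<b>FOO</b>\n</html>\n"));\n'

# Whitespace that directly follows "</html>" and directly precedes a double quote.
# PrecededBy and FollowedBy only check the context; they do not consume it.
trailing_ws = PrecededBy(Literal("</html>")) + White(" \t\n") + FollowedBy('"')

# Whitespace is the thing being matched here, so it must not be skipped.
trailing_ws.leaveWhitespace()

out2 = trailing_ws.suppress().transformString(insource)
print(out2)
```

This handles only the closing side. The whitespace in front of `<html>` still has to be removed separately, for example with the `FollowedBy` expression from the question.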

Here are a couple of ways to address your problem:

# define expressions to match leading and trailing 
# html tags, and just suppress the leading or trailing whitespace 
opener = White().suppress() + Literal("<html>") 
closer = Literal("</html>") + White().suppress() 

# define a single expression to match either opener 
# or closer - have to add leaveWhitespace() call so that 
# we catch the leading whitespace in opener 
either = opener|closer 
either.leaveWhitespace() 

print(either.transformString(insource))


# alternative, if you know what the tag will look like: 
# match 'info=<some double quoted string>', and use a parse 
# action to extract the contents within the quoted string, 
# call strip() to remove leading and trailing whitespace, 
# and then restore the original '"' characters (which are 
# auto-stripped by the QuotedString class by default) 
infovalue = QuotedString('"', multiline=True) 
infovalue.setParseAction(lambda t: '"' + t[0].strip() + '"') 
infoattr = "info=" + infovalue 

print(infoattr.transformString(insource))
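To try the first approach without the question's code at hand, here is a self-contained sketch of it. It assumes a shortened copy of the question's `insource` with the trailing spaces removed; the names `expected` and `result` are only there for the comparison and are not part of the original answer.

```python
from pyparsing import Literal, White

# Shortened copy of the question's input (illustrative; trailing spaces dropped).
insource = 'annotation (Documentation(info="\n    <html>\n<b>FOO</b>\n</html>\n"));\n'
# The output the question asks for.
expected = 'annotation (Documentation(info="<html>\n<b>FOO</b>\n</html>"));\n'

# Keep each tag and suppress only the whitespace next to it.
opener = White().suppress() + Literal("<html>")
closer = Literal("</html>") + White().suppress()

# leaveWhitespace() stops pyparsing from skipping the whitespace
# that opener needs to match before "<html>".
either = (opener | closer).leaveWhitespace()

result = either.transformString(insource)
print(result == expected)
```

Because the tags stay part of the match and are written back unchanged, no lookbehind is needed for this variant.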


2012-08-02 12:41:19 PaulMcG

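Thanks Paul! That is exactly what I was looking for. Because my real problem is more complex, I will stick with the first solution (although I really like the second implementation and will try to remember it). – 2012-08-02 15:06:36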
