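ANTLR4 parser grammar: testing for hidden channel tokens
I want to parse C preprocessor directives while skipping all other C++ syntax. Specifically, I need to distinguish function-like macros from object-like macros: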
# define abc(x,y) x##y //function like macro & token pasting operator
# define abc (a,b,c) //object like macro, replace 'abc' by '(a,b,c)'
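The key difference is that a function-like macro has no hidden tokens (whitespace or multi-line comments) between the identifier abc and the left parenthesis that follows it.
The problem is that my lexer already sends all multi-line comments and whitespace to hidden channels. So how can the parser detect whitespace before the left parenthesis?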
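To make the rule concrete, here is a minimal sketch outside ANTLR, in plain Python. It keeps the hidden tokens in the token list (much as a buffered token stream would) and calls the macro function-like only when the '(' sits at the token index immediately after the macro name. The names tokenize, classify_define and HIDDEN are hypothetical helpers for this illustration, not part of ANTLR or of my grammar, and the mini-lexer is far simpler than the real one.

```python
import re

# Hypothetical mini-lexer: it mirrors only the "hidden channel" idea,
# not ANTLR's API. In re.VERBOSE mode, whitespace inside [...] is kept.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/)   # multi-line comment, treated as hidden
  | (?P<WS>[ \t\f]+)         # whitespace, treated as hidden
  | (?P<ID>[_0-9a-zA-Z]+)    # same character set as ANY_ID
  | (?P<SYMBOL>.)            # any other single character
""", re.VERBOSE | re.DOTALL)

HIDDEN = {"COMMENT", "WS"}


def tokenize(text):
    """Return (kind, text) pairs, hidden tokens included."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def classify_define(line):
    """Classify a #define line as 'function-like' or 'object-like'."""
    tokens = tokenize(line)
    # Indices of the tokens a parser would see (the default channel).
    visible = [i for i, (kind, _) in enumerate(tokens) if kind not in HIDDEN]
    if (len(visible) < 3
            or tokens[visible[0]][1] != "#"
            or tokens[visible[1]][1] != "define"
            or tokens[visible[2]][0] != "ID"):
        raise ValueError("not a #define line")
    name_index = visible[2]
    if len(visible) > 3:
        next_index = visible[3]
        # Function-like only if '(' is the very next token in the full list,
        # i.e. no hidden token sits between the name and the parenthesis.
        if tokens[next_index][1] == "(" and next_index == name_index + 1:
            return "function-like"
    return "object-like"
```

Calling classify_define on the two #define lines from the top of this post should separate them, since only the second has a whitespace token between abc and '('. What I am looking for is a way to express the same adjacency test inside the ANTLR4 grammar.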
My lexer grammar looks like this:
CRLF: '\r'? '\n' -> channel(LINEBREAK);
WS: [ \t\f]+ -> channel(WHITESPACE);
ML_COMMENT: '/*' .*? '*/' -> channel(COMMENTS);
SL_COMMENT: '//' ~[\r\n]* -> channel(COMMENTS);
PPHASH: {getCharPositionInLine() == 0}? (ML_COMMENT | [ \t\f])* '#'
; //any line starts with a # as the first char (comments, ws before it skipped)
CHARACTER_LITERAL : 'L'? '\'' (CH_ESC |~['\r\n\\])*? '\'' ; //not exactly 1 char between '' e.g. '0x05'
fragment CH_ESC : '\\' . ;
STRING_LITERAL: 'L'? '"' (STR_ESC | ~["\r\n\\])*? '"' ;
fragment STR_ESC: '\\' . ;
ANY_ID: [_0-9a-zA-Z]+ ;
ALL_SYMBOL:
'~' | '!' | '@' | '#' | '$' | '%' | '^' | '&' | '*' | '=' | '-' | '+' | '\\'| '|' | ':' | ';' | '"' | '\''|
'<' | '>' | '.' | '?' | '/' | ',' | '[' | ']' | '(' | ')' | '{' | '}'
; //basically everything found in a keyboard
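I intend to use the PPHASH token to tell the parser where a preprocessor directive starts. It is a '#' that is the first visible character on a line.
Here is my (incorrect) parser grammar for the #define line: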
define_line:
PPHASH 'define' (function_like_define | object_like_define)
;
//--- function like define ---
function_like_define:
ANY_ID '(' parameter_seq? ')' fl_replacement_string
;
parameter_seq: ANY_ID (',' ANY_ID)* ;
//--- object like define ---
object_like_define:
ANY_ID ol_replacement_string
;
//fl&ol different names, visitor no need to test parent. Separate rule to make it a single node supporting getText()
fl_replacement_string: any_non_crlf_token*;
ol_replacement_string: any_non_crlf_token*;
any_non_crlf_token:ANY_ID | .....;
This grammar incorrectly parses #define abc (a,b,c) as a function-like macro. How can I fix the grammar?