Doubts about compiler construction (Flex/Bison)

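I am trying to build a simple compiler for my class; it is the second week and I am completely stuck on these points. I am providing simple.l as follows (the flex and bison files are snipped to save space):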

..snip.. 
end  {return(END);} 
skip  {return(SKIP);} 
in  {return(IN);} 
integer {return(INTEGER);} 
let  {return(LET);} 
..snip.. 
[ \t\n\r]+

and simple.y:

%start program 
%token LET IN END 
%token SKIP IF THEN ELSE WHILE DO READ WRITE FI ASSGNOP 
%token NUMBER PERIOD COMMA SEMICOLON INTEGER 
%token IDENTIFIER EWHILE LT 
%left '-' '+' 
%left '*' '/' 
%right '^' 
%% 

program : LET declarations IN commands END SEMICOLON 
declarations : 
|INTEGER id_seq IDENTIFIER PERIOD 
; 
id_seq: 
|id_seq IDENTIFIER COMMA 
; 
commands : 
| commands command SEMICOLON 
; 
command : SKIP 
; 
exp : NUMBER 
| IDENTIFIER 
| '('exp')' 
; 
..snip.. 
%%

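My first problem is that when I compile and run this, it accepts my input correctly right up to the end, but it does not stop there; instead it appears to go back to a starting state. Shouldn't it terminate when it reaches the final end;?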

With the input:

let 
integer x. 
in 
skip; 
end;

here is the output:

Starting parse 
Entering state 0 
Reading a token: let 
Next token is token LET() 
Shifting token LET() 
Entering state 1 
Reading a token: integer x. 
Next token is token INTEGER() 
Shifting token INTEGER() 
Entering state 3 
Reducing stack by rule 4 (line 22): 
-> $$ = nterm id_seq() 
Stack now 0 1 3 
Entering state 6 
Reading a token: Next token is token IDENTIFIER() 
Shifting token IDENTIFIER() 
Entering state 8 
Reading a token: Next token is token PERIOD() 
Shifting token PERIOD() 
Entering state 10 
Reducing stack by rule 3 (line 20): 
    $1 = token INTEGER() 
    $2 = nterm id_seq() 
    $3 = token IDENTIFIER() 
    $4 = token PERIOD() 
-> $$ = nterm declarations() 
Stack now 0 1 
Entering state 4 
Reading a token: in 
Next token is token IN() 
Shifting token IN() 
Entering state 7 
Reducing stack by rule 6 (line 25): 
-> $$ = nterm commands() 
Stack now 0 1 4 7 
Entering state 9 
Reading a token: skip; 
Next token is token SKIP() 
Shifting token SKIP() 
Entering state 13 
Reducing stack by rule 8 (line 28): 
    $1 = token SKIP() 
-> $$ = nterm command() 
Stack now 0 1 4 7 9 
Entering state 19 
Reading a token: Next token is token SEMICOLON() 
Shifting token SEMICOLON() 
Entering state 29 
Reducing stack by rule 7 (line 26): 
    $1 = nterm commands() 
    $2 = nterm command() 
    $3 = token SEMICOLON() 
-> $$ = nterm commands() 
Stack now 0 1 4 7 
Entering state 9 
Reading a token: end; 
Next token is token END() 
Shifting token END() 
Entering state 12 
Reading a token: Next token is token SEMICOLON() 
Shifting token SEMICOLON() 
Entering state 20 
Reducing stack by rule 1 (line 18): 
    $1 = token LET() 
    $2 = nterm declarations() 
    $3 = token IN() 
    $4 = nterm commands() 
    $5 = token END() 
    $6 = token SEMICOLON() 
-> $$ = nterm program() 
Stack now 0 
Entering state 2 
Reading a token:

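Why is it ready to read a token again after I have entered the final end;? What am I missing? Shouldn't it finish here? If I type anything now, it gives me the following error: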

Reading a token: let 
Next token is token LET() 
syntax error, unexpected LET, expecting $end 
Error: popping nterm program() 
Stack now 0 
Cleanup: discarding lookahead token LET() 
Stack now 0

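My second doubt is: what should the next step be in implementing this compiler? I mean, what further steps are needed between this point and the code generation part? How do I implement the symbol table now? And how do I make this parser accept code from a file? Until now I have been giving input at the terminal; what should I do if I want to accept code from a file such as my_program.simple? Thanks.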

Source

2013-03-30 Mojo Jojo

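It is waiting for (trying to read) an EOF (end-of-file indication). Give it an EOF (hit ctrl-D at the terminal) and it should exit cleanly. Or just redirect the input from a file, and the file will supply the EOF.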

Thanks a lot, Chris. That cleared up my doubt.
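For the question about reading my_program.simple directly rather than redirecting input, the usual Flex approach is to point the scanner's input stream yyin at an opened file before calling yyparse(). The following is a minimal sketch, assuming the generated scanner and parser are compiled as C++ alongside this driver. open_source is a hypothetical helper introduced for illustration and does not come from the post.

```cpp
#include <cstdio>

// Hypothetical helper (not from the original post): chooses the stream the
// parser should read. With no path it falls back to stdin; it returns nullptr
// if the file cannot be opened.
std::FILE* open_source(const char* path) {
    if (path == nullptr) {
        return stdin;
    }
    return std::fopen(path, "r");
}

// In a driver linked against the Flex/Bison output, the helper would be used
// roughly like this (shown as a comment because yyin and yyparse only exist
// once the generated sources are linked in):
//
//   extern std::FILE* yyin;  // input stream read by the Flex scanner
//   int yyparse();           // entry point of the Bison parser
//
//   int main(int argc, char** argv) {
//       std::FILE* in = open_source(argc > 1 ? argv[1] : nullptr);
//       if (in == nullptr) { std::perror(argv[1]); return 1; }
//       yyin = in;
//       return yyparse();    // 0 once `program` is reduced and EOF is seen
//   }
```

With a driver like this, running the binary as `./simple my_program.simple` (the executable name is an assumption) ends the parse when the file ends, so no ctrl-D is needed.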

declarations : 
|INTEGER id_seq IDENTIFIER PERIOD 
; 
...

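I think you are using the wrong syntax: you are stating that declarations (as well as id_seq and commands) can be epsilon, i.e. an empty production. That is because | is the alternative operator, so each rule is an alternative between an empty body and the actual pattern. That doesn't make sense.

I think this could be the reason the parser loops.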

For the symbol table, you can use a map (I hope you are generating C++), declared as a global outside the parser. Then insert symbols as you see them.
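The map idea can be sketched as follows. This is only an illustration under assumptions: the Symbol record, the declare and lookup helpers, and the choice of std::map are all hypothetical, since the post does not show any symbol-table code.

```cpp
#include <map>
#include <string>

// Hypothetical symbol record; the original post does not define one.
struct Symbol {
    int offset;  // slot number, assigned in declaration order
};

// Global table declared outside the parser, as the answer suggests.
std::map<std::string, Symbol> symbol_table;

// Inserts `name` into the table. Returns false if it was already declared,
// in which case the existing entry is left untouched.
bool declare(const std::string& name) {
    Symbol sym{static_cast<int>(symbol_table.size())};
    return symbol_table.emplace(name, sym).second;
}

// Returns a pointer to the symbol, or nullptr if `name` was never declared.
const Symbol* lookup(const std::string& name) {
    auto it = symbol_table.find(name);
    return it == symbol_table.end() ? nullptr : &it->second;
}
```

In the grammar, declare would be called from the actions of the rules that shift an IDENTIFIER inside declarations and id_seq, and lookup from the rules that use an identifier. That assumes the scanner passes the identifier's text as the token's semantic value, which the snipped files do not show.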

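Before getting to a compiler, it may be useful to have a working interpreter first; it is easier, and it clarifies many aspects that will be reused when building the compiler.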
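To give a rough idea of what "interpreter first" could look like, here is a small tree-walking evaluator for expressions. It is a sketch under assumptions: the Expr node type, the builder functions, and the Env map are hypothetical, and only NUMBER, IDENTIFIER, '+' and '*' from simple.y are covered.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical AST node; kinds mirror NUMBER, IDENTIFIER and the '+' and '*'
// operators declared in simple.y.
struct Expr {
    enum Kind { Number, Variable, Add, Mul } kind;
    int value = 0;                   // used by Number
    std::string name;                // used by Variable
    std::unique_ptr<Expr> lhs, rhs;  // used by Add and Mul
};

// Variable values at run time (hypothetical environment type).
using Env = std::map<std::string, int>;

std::unique_ptr<Expr> number(int v) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Number;
    e->value = v;
    return e;
}

std::unique_ptr<Expr> variable(std::string n) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Variable;
    e->name = std::move(n);
    return e;
}

std::unique_ptr<Expr> binary(Expr::Kind k, std::unique_ptr<Expr> l,
                             std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->kind = k;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Walks the tree and computes its value. Env::at throws std::out_of_range
// when a variable has no value in the environment.
int eval(const Expr& e, const Env& env) {
    switch (e.kind) {
        case Expr::Number:   return e.value;
        case Expr::Variable: return env.at(e.name);
        case Expr::Add:      return eval(*e.lhs, env) + eval(*e.rhs, env);
        case Expr::Mul:      return eval(*e.lhs, env) * eval(*e.rhs, env);
    }
    return 0;  // unreachable for valid kinds
}
```

The parser actions would build such nodes instead of evaluating directly. A later code generator can then walk the same tree and emit instructions where eval computes values, which is the reuse the answer refers to.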

Source

2013-03-30 17:28:08 CapelliC

It is redundant. If it were wrong, which is to say ambiguous, yacc would say so. It would not be a cause of the parser looping. – EJP
