2015-04-17 101 views

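Detecting comments in flex

I have just started learning flex. I have written a simple program that checks whether each word in a given text file is a verb and prints the result. I would also like to detect single-line and multi-line comments (C and C++ style) in the input file and print each whole comment to the output. Is there a way to do this? My sample code is below: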

%%

[\t ]+          { /* ignore whitespace */ }

is |
am |
are |
was |
were            { printf("%s: is a verb", yytext); }

[a-zA-Z]+       { printf("%s: is not a verb", yytext); }

.|\n            { /* discard everything else */ }

%%

int main(int argc, char *argv[]){
    yyin = fopen(argv[1], "r");
    yylex();
    fclose(yyin);
}

Answer

This is a bit involved. I suggest using start conditions to handle the comments. Here is a lexer I quickly put together:

%option noyywrap 
%x COMMENT_SINGLE 
%x COMMENT_MULTI 

%top{ 
/* for strndup */ 
#include <string.h> 
} 

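%{
/* Points into flex's input buffer at the first character of the comment
   currently being scanned. It is only valid as long as flex does not
   refill (and thereby move) its buffer, so very long comments are not safe. */
char* commentStart;
%}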

%% 

[\n\t\r ]+ {
    /* ignore whitespace */
}

<INITIAL>"//" { 
    /* begin of single-line comment */ 
    commentStart = yytext; 
    BEGIN(COMMENT_SINGLE); 
} 

<COMMENT_SINGLE>\n { 
    /* end of single-line comment */ 
    char* comment = strndup(commentStart, yytext - commentStart); 
    printf("'%s': was a single-line comment\n", comment); 
    free(comment); 
    BEGIN(INITIAL); 
} 

<COMMENT_SINGLE>[^\n]+ { 
    /* suppress whatever is in the comment */ 
} 

<INITIAL>"/*" { 
    /* begin of multi-line comment */ 
    commentStart = yytext; 
    BEGIN(COMMENT_MULTI); 
} 

<COMMENT_MULTI>"*/" { 
    /* end of multi-line comment */ 
    char* comment = strndup(commentStart, yytext + 2 - commentStart); 
    printf("'%s': was a multi-line comment\n", comment); 
    free(comment); 
    BEGIN(INITIAL); 
} 

<COMMENT_MULTI>. { 
    /* suppress whatever is in the comment */ 
} 

<COMMENT_MULTI>\n { 
    /* don't print newlines */ 
} 

is | 
am | 
are | 
was | 
were { 
    printf("'%s': is a verb\n", yytext); 
} 

[a-zA-Z]+ { 
    printf("'%s': is not a verb\n", yytext); 
} 

. { 
    /* don't print everything else */ 
} 

%% 

int main(int argc, char *argv[]){  
    yyin = fopen(argv[1], "r");  
    yylex();   
    fclose(yyin); 
} 
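
If it helps to see what the start conditions amount to, the same three states can be written out by hand in plain C, without flex. This is only a rough sketch for illustration, not what flex generates: `scan_comments` and `scan_state` are names I made up, the input is assumed to be a NUL-terminated string, and unlike the lexer above it also reports a single-line comment that is ended by end of input instead of a newline.

```c
#include <stdio.h>
#include <string.h>

/* Names mirror the flex start conditions; INITIAL is flex's default state. */
enum scan_state { INITIAL, COMMENT_SINGLE, COMMENT_MULTI };

/* Hypothetical helper: scans `src`, writes one line per comment to `out`,
   and returns the number of comments found. */
int scan_comments(const char *src, FILE *out)
{
    enum scan_state state = INITIAL;
    const char *start = NULL;   /* plays the role of commentStart */
    const char *p;
    int count = 0;

    for (p = src; *p != '\0'; p++) {
        switch (state) {
        case INITIAL:
            if (strncmp(p, "//", 2) == 0) {
                start = p;
                state = COMMENT_SINGLE;
                p++;            /* skip the second delimiter character */
            } else if (strncmp(p, "/*", 2) == 0) {
                start = p;
                state = COMMENT_MULTI;
                p++;
            }
            break;
        case COMMENT_SINGLE:
            if (*p == '\n') {   /* the newline ends the comment */
                fprintf(out, "'%.*s': was a single-line comment\n",
                        (int)(p - start), start);
                count++;
                state = INITIAL;
            }
            break;
        case COMMENT_MULTI:
            if (strncmp(p, "*/", 2) == 0) {
                fprintf(out, "'%.*s': was a multi-line comment\n",
                        (int)(p + 2 - start), start);
                count++;
                state = INITIAL;
                p++;
            }
            break;
        }
    }

    /* Assumption beyond the lexer above: end of input also closes
       a single-line comment. */
    if (state == COMMENT_SINGLE) {
        fprintf(out, "'%.*s': was a single-line comment\n",
                (int)(p - start), start);
        count++;
    }
    return count;
}
```

Calling `scan_comments(text, stdout)` on a buffer holding the file contents prints the comments in the same format as the lexer. The exclusive (`%x`) conditions correspond to the fact that, inside either comment state, none of the INITIAL checks are tried.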

Note: the flex lexer code is already fairly long, so I omitted any error checking.
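
For reference, the omitted checks could look roughly like the following. This is a sketch under my own assumptions: `open_input` is a hypothetical helper, not part of flex, and the wording of the messages is arbitrary.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the stream to scan, or NULL after
   printing a diagnostic to stderr. */
FILE *open_input(int argc, char *argv[])
{
    const char *prog = (argc > 0 && argv[0] != NULL) ? argv[0] : "lexer";

    if (argc < 2) {
        fprintf(stderr, "usage: %s <input-file>\n", prog);
        return NULL;
    }

    FILE *in = fopen(argv[1], "r");
    if (in == NULL) {
        fprintf(stderr, "%s: cannot open '%s': %s\n",
                prog, argv[1], strerror(errno));
    }
    return in;
}
```

In the lexer's `main` you would then assign the result to `yyin` and return a non-zero exit status when it is NULL, instead of passing an unchecked `fopen` result to `yylex()` and `fclose()`.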

The code did detect the multi-line comment, but single-line comments are not each handled individually. Additionally, is there a way to remove the /* and */ from the output? @stj – SKB

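Whenever I try it (generated with flex 2.5.39 and default options), the code above handles single-line and multi-line comments individually. Can you provide an input for which it does not? – stj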

To remove the '/*' and '*/' from multi-line comments, you need to change this line: 'char* comment = strndup(commentStart, yytext + 2 - commentStart);' so that it looks like this: 'char* comment = strndup(commentStart + 2, yytext - 2 - commentStart);' – stj
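
The offsets in that change can be checked outside of flex with a small C sketch. The helper names here are hypothetical: `start` stands in for commentStart, `close` stands in for yytext at the moment the closing delimiter is matched, and `copy_span` is a plain malloc/memcpy stand-in for strndup so that the example compiles under strict ISO C.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for strndup: copies `len` bytes and NUL-terminates.
   Returns NULL if allocation fails; the caller frees the result. */
static char *copy_span(const char *from, size_t len)
{
    char *out = malloc(len + 1);
    if (out != NULL) {
        memcpy(out, from, len);
        out[len] = '\0';
    }
    return out;
}

/* Original behaviour: from the opening slash-star up to and including
   the two characters of the closing delimiter. */
char *comment_with_delims(const char *start, const char *close)
{
    return copy_span(start, (size_t)(close + 2 - start));
}

/* Adjusted behaviour: skip the two opening characters and stop right
   before the closing delimiter. */
char *comment_body(const char *start, const char *close)
{
    return copy_span(start + 2, (size_t)(close - 2 - start));
}
```

For the text `/* hi */`, with `close` pointing at the final `*`, the first helper yields `/* hi */` and the second yields ` hi ` (the inner spaces are kept). The same `+ 2` on the start pointer would drop the `//` from single-line comments, with the length reduced by 2 accordingly.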