Good Python strategy for parsing grammar-based file formats

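I have written quite a few simple importers for 3D file formats such as PLY and OBJ. Those formats have a very state-based, per-line structure, which makes them easy to parse. A friend wants me to implement a simple importer in Python for Mirai's file type, and I noticed that it can contain a lot of hierarchically represented data, unlike the simpler line-by-line formats I have worked with before.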

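I am wondering whether I should use a Python library, build up a full grammar out of complex regular expressions, or hack together a solution with string replacement. Can anyone offer good advice on parsing files like this? This particular example is an exported cube: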

filetype gx; 
GrammarVersion 2.1.0.0; 
TemplateVersion 2.1.0.0; 
HostName "ZOO-HP"; 
UserName "Phil"; 
TimeStamp "Mon 20-Aug-12, 9:48 pm"; 
OSName "Windows NT 6.01.7601"; 
ApplicationName "Mirai"; 
ApplicationVersion "1.1.0.1 5629"; 
include "gbf-2-1-0-0.tpl"; 
include "cube_mirai.gmf"; 


body Polyhedron-31 (

    vertices[] < (
coord -0.500000 -0.500000 0.500000 ; 
) 
(
coord -0.500000 0.500000 0.500000 ; 
) 
(
coord 0.500000 0.500000 0.500000 ; 
) 
(
coord 0.500000 -0.500000 0.500000 ; 
) 
(
coord 0.500000 -0.500000 -0.500000 ; 
) 
(
coord 0.500000 0.500000 -0.500000 ; 
) 
(
coord -0.500000 0.500000 -0.500000 ; 
) 
(
coord -0.500000 -0.500000 -0.500000 ; 
) 
> 
    faces[] < (
normal 0.000000 0.000000 1.00000 ; 
     vertex-indices[] <0;1;2;3;> 
     vertex-normal-indices[] <0;1;2;3;>) 
(
normal 0.000000 0.000000 -1.00000 ; 
     vertex-indices[] <4;5;6;7;> 
     vertex-normal-indices[] <4;5;6;7;>) 
(
normal 0.000000 1.00000 0.000000 ; 
     vertex-indices[] <1;6;5;2;> 
     vertex-normal-indices[] <1;6;5;2;>) 
(
normal 0.000000 -1.00000 0.000000 ; 
     vertex-indices[] <7;0;3;4;> 
     vertex-normal-indices[] <7;0;3;4;>) 
(
normal 1.00000 0.000000 0.000000 ; 
     vertex-indices[] <3;2;5;4;> 
     vertex-normal-indices[] <3;2;5;4;>) 
(
normal -1.00000 0.000000 0.000000 ; 
     vertex-indices[] <7;6;1;0;> 
     vertex-normal-indices[] <7;6;1;0;>) 
> 
    normals[] <-0.577350 -0.577350 0.577350 ; 
-0.577350 0.577350 0.577350 ; 
0.577350 0.577350 0.577350 ; 
0.577350 -0.577350 0.577350 ; 
0.577350 -0.577350 -0.577350 ; 
0.577350 0.577350 -0.577350 ; 
-0.577350 0.577350 -0.577350 ; 
-0.577350 -0.577350 -0.577350 ; 
> 
)

Source

2013-01-18 voodoogiant

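It is unclear how a 'string replacement' hack is relevant here; it suggests a different problem from the original question of parsing text with regular expressions.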

[Python parsing tools](http://nedbatchelder.com/text/python-parsers.html) – jfs

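To parse a construct this large I would avoid hand-writing complex regular expressions; they become too expensive to maintain and debug.

I would take a look at PyParsing, which comes with a wide range of examples, or at PLY.

Either of these lets you parse the file in a more organised way, which should be easier to maintain. They will also be easier to extend beyond the simple cube example to cover the full range of the Mirai file format.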

Source

2013-01-18 03:07:39
