2013-08-06 73 views
5

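Parsing JSON-like markup

Honestly, I don't know what this markup is (I'd like to know whether it has a name). What is the simplest way to parse a structure like this? I have a lot of them in txt files.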

unlockType BirthdayCake { 
     // Don't delete 
    commonName  "Birthday Cake" 
    autoTag 
    category  Item 
    path   models/ 
    timedExclusive 1 
    descSymbol  BirthdayCakeDesc 
    dispSymbol  BirthdayCakeDisp 
    flairCfg  "Cake/Idle.aaf mat_BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /followDist 3.0 /moveSlew 0.0666 /moveVelThresh 10.0 /animEaseTime 1.0 /zOffRobot 2.7 /rotX 20.0 /zSpinDef bone_spinA 80.0" 
    //OnInspectOrUnlock Menus previewInit Cake/Idle.aaf BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /rotSpeed 30 /pos 3.5 0 1 /scaleMult 1.0 
    property  BirthdaySpirit   10 
} 
+0

You could try explode() on the tabs, though that is not very efficient :( –

+0

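It looks like fields are separated by tabs/spaces and entries are separated by newlines. If it were only numbers and strings without spaces, that would be fine; the problem is the quoted strings, which are a bit tricky to match with a regex.. – MightyPork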
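The quoted-string problem mentioned in this comment can be handled with a two-branch pattern. The following is only a sketch, assuming double quotes never contain escaped quotes and never touch an unquoted word; `tokenizeLine` is a hypothetical helper name, not something from the thread.

```php
<?php
// Hypothetical helper (not from the thread): split one line into fields,
// treating a double-quoted run as a single field.
function tokenizeLine(string $line): array
{
    // Branch 1: "..." with the text between the quotes captured in group 1.
    // Branch 2: any run of non-whitespace characters, captured in group 2.
    preg_match_all('/"([^"]*)"|(\S+)/', $line, $matches, PREG_SET_ORDER);

    $fields = array();
    foreach ($matches as $m) {
        // Group 2 is non-empty only when the unquoted branch matched.
        $fields[] = (isset($m[2]) && $m[2] !== '') ? $m[2] : $m[1];
    }
    return $fields;
}

print_r(tokenizeLine('    commonName  "Birthday Cake" '));
print_r(tokenizeLine('    property  BirthdaySpirit   10 '));
```

Because whitespace is only ever skipped between matches, the same pattern works for tabs, single spaces and mixed indentation.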

+9

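Why do people insist on creating their own non-standard markup formats when plenty of good standards already exist? It's worth finding out where this comes from, because there may well be a perfectly good parser for it. It looks a lot like it could be program code, but I don't recognise the language. I'm really curious where you got it from - how can whoever supplied it to you not be able to tell you what it is? – Spudley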

Answers

3

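Here is my solution:

You have to feed it only the inside of the {} block, and you get back an array that respects quoted strings, etc.

Unlike the other answers, it takes care of "text with quotes" and accepts spaces, tabs and mixed formatting.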

$a = <<<EOT
    commonName  "Birthday Cake" 
    autoTag 
    category  Item 
    path   models/ 
    timedExclusive 1 
    descSymbol  BirthdayCakeDesc 
    dispSymbol  BirthdayCakeDisp 
    flairCfg  "Cake/Idle.aaf mat_BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /followDist 3.0 /moveSlew 0.0666 /moveVelThresh 10.0 /animEaseTime 1.0 /zOffRobot 2.7 /rotX 20.0 /zSpinDef bone_spinA 80.0" 
    //OnInspectOrUnlock Menus previewInit Cake/Idle.aaf BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /rotSpeed 30 /pos 3.5 0 1 /scaleMult 1.0 
    property  BirthdaySpirit   10 
EOT;

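$lines = explode("\n", $a);

$parsed = array();

foreach ($lines as $line) {
    $chars = str_split($line);

    $quoteOpen = false;    // true while we are inside a "quoted string"
    $lastField = "";       // the field currently being accumulated
    $lineFields = array(); // all fields found on this line

    foreach ($chars as $c) {
        if ($c == '"') {
            if ($quoteOpen) {
                // closing quote: the quoted text is one complete field
                $quoteOpen = false;
                $lineFields[] = $lastField;
            } else {
                $quoteOpen = true;
            }
            $lastField = "";
            continue;
        }

        if (preg_match("/\\s/", $c) === 0) {
            // non-whitespace always belongs to the current field
            $lastField .= $c;
        } else {
            if ($lastField != "" && !$quoteOpen) {
                // whitespace outside quotes ends the current field
                $lineFields[] = $lastField;
                $lastField = "";
            } else {
                // inside quotes: keep whitespace once the field has started
                if ($lastField != "") {
                    $lastField .= $c;
                }
            }
        }
    }

    // flush a field that runs up to the end of the line
    if ($lastField != "") {
        $lineFields[] = $lastField;
    }

    $parsed[] = $lineFields;
}

echo '<pre>'; print_r($parsed); echo '</pre>';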

Output:

Array 
(
    [0] => Array 
     (
      [0] => commonName 
      [1] => Birthday Cake 
     ) 

    [1] => Array 
     (
      [0] => autoTag 
     ) 

    [2] => Array 
     (
      [0] => category 
      [1] => Item 
     ) 

    [3] => Array 
     (
      [0] => path 
      [1] => models/ 
     ) 

    [4] => Array 
     (
      [0] => timedExclusive 
      [1] => 1 
     ) 

    [5] => Array 
     (
      [0] => descSymbol 
      [1] => BirthdayCakeDesc 
     ) 

    [6] => Array 
     (
      [0] => dispSymbol 
      [1] => BirthdayCakeDisp 
     ) 

    [7] => Array 
     (
      [0] => flairCfg 
      [1] => Cake/Idle.aaf mat_BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /followDist 3.0 /moveSlew 0.0666 /moveVelThresh 10.0 /animEaseTime 1.0 /zOffRobot 2.7 /rotX 20.0 /zSpinDef bone_spinA 80.0 
     ) 

    [8] => Array 
     (
      [0] => //OnInspectOrUnlock 
      [1] => Menus 
      [2] => previewInit 
      [3] => Cake/Idle.aaf 
      [4] => BirthdayCake< 
      [5] => /scale 
      [6] => 1.5 
      [7] => /animation 
      [8] => Cake/Idle.aaf 
      [9] => 0 
      [10] => looping 
      [11] => 0.8 
      [12] => /rotSpeed 
      [13] => 30 
      [14] => /pos 
      [15] => 3.5 
      [16] => 0 
      [17] => 1 
      [18] => /scaleMult 
      [19] => 1.0 
     ) 

    [9] => Array 
     (
      [0] => property 
      [1] => BirthdaySpirit 
      [2] => 10 
     ) 

) 
+0

Looks like your solution is the best (if I strip out the comments myself), thanks! – Fedcomp

+0

You can check whether the first element of each array starts with //, and if so, ignore it. – MightyPork
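A minimal sketch of that check, assuming rows shaped like the `$parsed` array printed above. `dropCommentRows` is a hypothetical helper name, and dropping empty rows as well is an extra choice made here, not something the thread asks for.

```php
<?php
// Hypothetical helper: remove rows whose first field starts with "//".
function dropCommentRows(array $rows): array
{
    $kept = array_filter($rows, function ($fields) {
        // Keep a row only if it has a first field and that field
        // does not begin with "//" (empty rows are dropped too).
        return isset($fields[0]) && strncmp($fields[0], '//', 2) !== 0;
    });
    // Re-index so the result is a plain 0..n-1 list again.
    return array_values($kept);
}

// Sample rows in the same shape as $parsed in the answer above.
$rows = array(
    array('commonName', 'Birthday Cake'),
    array('//OnInspectOrUnlock', 'Menus', 'previewInit'),
    array('property', 'BirthdaySpirit', '10'),
    array(), // an empty input line produces an empty row
);

print_r(dropCommentRows($rows));
```

This also covers a comment written as `// Don't delete`, since the parser emits `//` as the first field of that row.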

+0

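I had already written a comment-stripping/trim function even before I asked this question (and before taking the function from your reply). Anyway, thanks! You saved me time. – Fedcomp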

5
$str = "unlockType BirthdayCake { 
     // Don't delete 
    commonName  \"Birthday Cake\" 
    autoTag 
    category  Item 
    path   models/ 
    timedExclusive 1 
    descSymbol  BirthdayCakeDesc 
    dispSymbol  BirthdayCakeDisp 
    flairCfg  \"Cake/Idle.aaf mat_BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /followDist 3.0 /moveSlew 0.0666 /moveVelThresh 10.0 /animEaseTime 1.0 /zOffRobot 2.7 /rotX 20.0 /zSpinDef bone_spinA 80.0\" 
    //OnInspectOrUnlock Menus previewInit Cake/Idle.aaf BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /rotSpeed 30 /pos 3.5 0 1 /scaleMult 1.0 
    property  BirthdaySpirit   10 
} 

unlockType PetFish3 { 
     commonName    \"Lionfish\" 
     autoTag 
     category    Pet 
     path     flair/ 
     descSymbol    PetFish3Desc 
     dispSymbol    PetFish3Disp 
     flairCfg    \"pet flair/PetFishes/PetFish3.amf mat_PetFishes< /scale 1.1 /animation flair/PetFishes/idle3.aaf 0 looping 0.45 /moveAnim flair/PetFishes/fly1.aaf 1 looping 1.62 /followDist 3.0 /moveSlew 0.045 /moveVelThresh 8.0 /animEaseTime 0.45 /zOffRobot 2.6 /rotX 15.0 /moveSlew 0.05 /turnToMove 230\" 
} 
"; 

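function parseThis($text)
{
    $types = array();
    // Find every "unlockType <name> { ... }" block; the s flag lets . span newlines
    preg_match_all('#(unlockType [^\{]+{.+?\n\s*})#s', $text, $matches);
    foreach ($matches[1] as $str)
    {
        // The second space-separated word of the block is its name
        $typeName = preg_replace('#^[^ ]+ ([^ ]+).*#s', '$1', $str);
        // One entry per line, with surrounding whitespace removed
        $contents = preg_split('#(\r?\n)+#', $str);
        $contents = array_map('trim', $contents);
        array_pop($contents);   // drop the closing "}" line
        array_shift($contents); // drop the "unlockType <name> {" line
        $data = array();
        foreach ($contents as $line)
        {
            if (substr($line, 0, 2) == '//') continue; // skip comment lines
            // Fields are separated by tabs or by 3+ whitespace characters
            $parts = preg_split("#(\t+|\s{3,})#", $line);
            $title = array_shift($parts);
            $partC = count($parts);
            // No value: '', one value: a string, several values: an array
            $data[$title] = $partC == 1 ? $parts[0] : ($partC == 0 ? '' : $parts);
        }
        $types[$typeName] = $data;
    }
    return $types;
}
$types = parseThis($str);
echo '<pre>'.print_r($types, true).'</pre>';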

Output:

Array 
(
    [BirthdayCake] => Array 
     (
      [commonName] => "Birthday Cake" 
      [autoTag] => 
      [category] => Item 
      [path] => models/ 
      [timedExclusive] => 1 
      [descSymbol] => BirthdayCakeDesc 
      [dispSymbol] => BirthdayCakeDisp 
      [flairCfg] => "Cake/Idle.aaf mat_BirthdayCake< /scale 1.5 /animation Cake/Idle.aaf 0 looping 0.8 /followDist 3.0 /moveSlew 0.0666 /moveVelThresh 10.0 /animEaseTime 1.0 /zOffRobot 2.7 /rotX 20.0 /zSpinDef bone_spinA 80.0" 
      [property] => Array 
       (
        [0] => BirthdaySpirit 
        [1] => 10 
       ) 

     ) 

    [PetFish3] => Array 
     (
      [commonName] => "Lionfish" 
      [autoTag] => 
      [category] => Pet 
      [path] => flair/ 
      [descSymbol] => PetFish3Desc 
      [dispSymbol] => PetFish3Disp 
      [flairCfg] => "pet flair/PetFishes/PetFish3.amf mat_PetFishes< /scale 1.1 /animation flair/PetFishes/idle3.aaf 0 looping 0.45 /moveAnim flair/PetFishes/fly1.aaf 1 looping 1.62 /followDist 3.0 /moveSlew 0.045 /moveVelThresh 8.0 /animEaseTime 0.45 /zOffRobot 2.6 /rotX 15.0 /moveSlew 0.05 /turnToMove 230" 
     ) 

) 

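Rough explanation

  • Use preg_match_all to find each block (unlockType someRandomText {....})
  • Loop over each preg_match_all result (each block) and parse the blocks one at a time
    • Split the contents of the {..} on newlines, then map each result through trim() to remove any leading and trailing spaces/tabs
      • Split each line on 3 or more spaces (as proper tabs don't seem to have been used); the pattern also accepts runs of tabs
      • Use the first piece of the split as the key in our array, then put the rest of the split into the value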
+0

I would turn the 'unlockType' matcher into a type-agnostic declaration token parser, and replace '\t' with '\s+' just in case. – Flosculus
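One way to read that suggestion, as a rough sketch rather than a drop-in replacement for `parseThis`: match any `<type> <name> { ... }` block and split each line on the first whitespace run only. `parseBlocks` is a hypothetical name, and the sketch assumes the closing brace sits on a line of its own and that type and block names are plain word characters.

```php
<?php
// Hypothetical generalisation: accept any "<type> <name> { ... }" block,
// not only "unlockType", and split key from value on any whitespace run.
function parseBlocks(string $text): array
{
    $blocks = array();
    // (\w+) = declaration type, (\w+) = block name, (.*?) = body up to a
    // closing brace that starts a line (m: ^ matches per line, s: . spans lines).
    preg_match_all('/^\s*(\w+)\s+(\w+)\s*\{(.*?)^\s*\}/ms', $text, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        $data = array();
        foreach (preg_split('/\R/', $m[3]) as $line) {
            $line = trim($line);
            if ($line === '' || strncmp($line, '//', 2) === 0) {
                continue; // skip blank lines and comments
            }
            // Limit 2: the key is the first word and the value is the rest
            // of the line, so spaces inside a quoted value are left alone.
            $parts = preg_split('/\s+/', $line, 2);
            $data[$parts[0]] = isset($parts[1]) ? $parts[1] : '';
        }
        $blocks[$m[1]][$m[2]] = $data;
    }
    return $blocks;
}

$sample = <<<EOT
unlockType BirthdayCake {
    // Don't delete
    commonName  "Birthday Cake"
    autoTag
    property  BirthdaySpirit   10
}
EOT;

print_r(parseBlocks($sample));
```

The trade-off is that multi-value lines such as `property BirthdaySpirit 10` come back as one string rather than an array, and quoted values keep their quotes, so a second pass is still needed if individual fields matter.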

+2

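Glad you solved his problem, but I don't see how this helps the community much. You should consider explaining what strategy you used to solve the problem, and why. –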

+0

Uh.. okay, looks like your answer is both faster and better. Well, at least I tried.. – MightyPork