2013-02-22

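PHP SimpleXMLElement: parsing XML tags with multiple 'potential' attributes

As the title implies, I have a question about parsing XML tags that may have multiple attributes (or none at all), and I'm looking for suggestions on how to accomplish this. First, though, I think a little background is in order.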

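I'm working on a PHP-based AIML interpreter script called Program O, and I'm in the process of migrating its code from string replacement functions (e.g. str_replace, preg_replace, etc.) to PHP's built-in SimpleXML functions. So far, nearly all of the parsing functions I've created for the various AIML tags are complete and working very well, but one tag in particular is kicking my seat warmer: the CONDITION tag.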

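According to the AIML tag reference, the tag has three different "forms": one with both a NAME and a (VALUE | CONTAINS | EXISTS) attribute, called the "multi condition"; one with only the NAME attribute, called the "single name list condition"; and a final form, called the "list condition", which is just the CONDITION tag with no attributes at all. The AIML tag reference I linked to has examples of all three forms, but with a lot of words in between, so I'll repeat them here along with the surrounding AIML code: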

FORM: multi condition tag:

<category> 
    <pattern>I AM BLOND</pattern> 
    <template>You sound very 
    <condition name="gender" value="female"> attractive.</condition> 
    <condition name="gender" value="male"> handsome.</condition> 
    </template> 
</category> 

FORM: list condition tag:

<category> 
    <pattern>I AM BLOND</pattern> 
    <template>You sound very 
    <condition> 
     <li name="gender" value="female"> attractive.</li> 
     <li name="gender" value="male"> handsome.</li> 
    </condition> 
    </template> 
</category> 

FORM: single name list condition tag:

<category> 
    <pattern>I AM BLOND</pattern> 
    <template>You sound very 
    <condition name="gender"> 
     <li value="female"> attractive.</li> 
     <li value="male"> handsome.</li> 
    </condition> 
    </template> 
</category> 
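
The three forms above differ only in which attributes sit on the CONDITION tag itself, so one way to start is to classify the element before parsing it. The following is a minimal sketch, not code from Program O: the helper name `condition_form()` and the labels it returns are assumptions made for illustration, and it relies only on `isset()` against SimpleXML attribute access.

```php
<?php
// Hypothetical helper (not part of Program O): name the form of a CONDITION element.
function condition_form(SimpleXMLElement $condition)
{
    $hasName = isset($condition['name']);
    $hasTest = isset($condition['value'])
        || isset($condition['contains'])
        || isset($condition['exists']);

    if ($hasName && $hasTest) {
        return 'multi';        // <condition name="..." value="...">
    }
    if ($hasName) {
        return 'single-name';  // <condition name="..."> with <li value="..."> children
    }
    if ($hasTest) {
        return 'invalid';      // a test attribute without NAME matches none of the three forms
    }
    return 'list';             // bare <condition> with <li name="..." value="..."> children
}

$xml = <<<XML
<template>
    <condition name="gender" value="female"> attractive.</condition>
    <condition name="gender"><li value="male"> handsome.</li></condition>
    <condition><li name="gender" value="male"> handsome.</li></condition>
</template>
XML;

$template = simplexml_load_string($xml);
$forms = array();
foreach ($template->condition as $condition) {
    $forms[] = condition_form($condition);
}
echo implode(', ', $forms), PHP_EOL; // prints "multi, single-name, list"
```

Classifying first keeps each branch of the parser focused on a single form, at the cost of one extra function call per tag.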

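Previous versions of the script I'm working on only handled the "list condition" form of the CONDITION tag. While that is the most commonly used form, it isn't the only one in use, so I need to be able to accommodate the other two forms as well. So my question is: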

How can this be done in an efficient manner?

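I already have working code to parse the list condition form of the CONDITION tag, and preliminary testing looks promising: it doesn't throw errors and seems to produce the desired response (but only for the list condition form; the other two forms fail with errors, for obvious reasons). The function is listed below: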

function parse_condition_tag($convoArr, $element, $parentName, $level) 
{ 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Starting function and setting timestamp.', 2); 
    $response = array(); 
    // Cast to string: a SimpleXMLElement attribute cannot be used as an array key. 
    $attrName = (string) $element['name']; 
    if ($attrName !== '') 
    { 
        $attrName = ($attrName == '*') ? $convoArr['star'][1] : $attrName; 
        $search = $convoArr['client_properties'][$attrName]; 
        // Pick the <li> whose VALUE matches the client property, else the <li> with no attributes. 
        $path = ($search != 'undefined') ? "//li[@value=\"$search\"]" : '//li[not(@*)]'; 
        $choice = $element->xpath($path); 
        $children = $choice[0]->children(); 
        if (!empty ($children)) 
        { 
            $response = parseTemplateRecursive($convoArr, $children, $level + 1); 
        } 
        else 
        { 
            $response[] = (string) $choice[0]; 
        } 
        $response_string = implode_recursive(' ', $response, __FILE__, __FUNCTION__, __LINE__); 
        runDebug(__FILE__, __FUNCTION__, __LINE__, "Returning '$response_string' and exiting function.", 4); 
        return $response_string; 
    } 
    trigger_error('Parsing of the CONDITION tag failed! XML = ' . $element->asXML()); 
} 

I'm still relatively new to the SimpleXML functions, so I may well be missing something obvious. In fact, I'm hoping that's exactly the case. :)

EDIT: As promised in one of my comments below, here is the function I finally ended up with:

/* 
    * function parse_condition_tag 
    * Acts as a de-facto if/else structure, selecting a specific output, based on certain criteria 
    * @param [array] $convoArr - The conversation array (a container for a number of necessary variables) 
    * @param [object] $element - The current XML element being parsed 
    * @param [string] $parentName - The parent tag (if applicable) 
    * @param [int] $level   - The current recursion level 
    * @return [string] $response_string 
    */ 

function parse_condition_tag($convoArr, $element, $parentName, $level) 
{ 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Starting function and setting timestamp.', 2); 
    global $error_response; 
    $response = array(); 
    $attrName = $element['name']; 
    $attributes = (array)$element->attributes(); 
    $attributesArray = (isset($attributes['@attributes'])) ? $attributes['@attributes'] : array(); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Element attributes:' . print_r($attributesArray, true), 1); 
    $attribute_count = count($attributesArray); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "Element attribute count = $attribute_count", 1); 
    if ($attribute_count == 0) // Bare condition tag 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Parsing a CONDITION tag with no attributes. XML = ' . $element->asXML(), 2); 
    $liNamePath = 'li[@name]'; 
    $condition_xPath = ''; 
    $exclude = array(); 
    $choices = $element->xpath($liNamePath); 
    foreach ($choices as $choice) 
    { 
     $choice_name = (string)$choice['name']; 
     if (in_array($choice_name, $exclude)) continue; 
     $exclude[] = $choice_name; 
     runDebug(__FILE__, __FUNCTION__, __LINE__, 'Client properties = ' . print_r($convoArr['client_properties'], true), 2); 
     $choice_value = get_client_property($convoArr, $choice_name); 
     $condition_xPath .= "li[@name=\"$choice_name\"][@value=\"$choice_value\"]|"; 
    } 
    $condition_xPath .= 'li[not(@*)]'; 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "xpath search = $condition_xPath", 4); 
    $pick_search = $element->xpath($condition_xPath); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Pick array = ' . print_r($pick_search, true), 2); 
    $pick_count = count($pick_search); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "Pick count = $pick_count.", 2); 
    $pick = $pick_search[0]; 
    } 
    elseif (array_key_exists('value', $attributesArray) or array_key_exists('contains', $attributesArray) or array_key_exists('exists', $attributesArray)) // condition tag with either VALUE, CONTAINS or EXISTS attributes 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Parsing a CONDITION tag with 2 attributes.', 2); 
    $condition_name = (string)$element['name']; 
    $test_value = get_client_property($convoArr, $condition_name); 
    switch (true) 
    { 
     case (isset($element['value'])): 
     $condition_value = (string)$element['value']; 
     break; 
     case (isset($element['contains'])): 
     // NOTE: still compared with == below; substring matching for CONTAINS is not implemented here 
     $condition_value = (string)$element['contains']; 
     break; 
     case (isset($element['exists'])): 
     // NOTE: still compared with == below; EXISTS semantics are not implemented here 
     $condition_value = (string)$element['exists']; 
     break; 
     default: 
     runDebug(__FILE__, __FUNCTION__, __LINE__, 'Something went wrong with parsing the CONDITION tag. Returning the error response.', 1); 
     return $error_response; 
    } 
    $pick = ($condition_value == $test_value) ? $element : ''; 
    } 
    elseif (array_key_exists('name', $attributesArray)) // this ~SHOULD~ just trigger if the NAME value is present, and ~NOT~ NAME and (VALUE|CONTAINS|EXISTS) 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Parsing a CONDITION tag with only the NAME attribute.', 2); 
    $condition_name = (string)$element['name']; 
    $test_value = get_client_property($convoArr, $condition_name); 
    $path = "li[@value=\"$test_value\"]|li[not(@*)]"; 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "search string = $path", 4); 
    $choice = $element->xpath($path); 
    $pick = $choice[0]; 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Found a match. Pick = ' . print_r($choice, true), 4); 
    } 
    else // nothing matches 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'No matches found. Returning default error response.', 1); 
    return $error_response; 
    } 
    $children = (is_object($pick)) ? $pick->children() : null; 
    if (!empty ($children)) 
    { 
    $response = parseTemplateRecursive($convoArr, $children, $level + 1); 
    } 
    else 
    { 
    $response[] = (string) $pick; 
    } 
    $response_string = implode_recursive(' ', $response); 
    return $response_string; 
} 

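I suspect there's a better, more elegant way of doing this (story of my life, really), but the code above works as intended. Any suggestions for improvement will be gratefully received and seriously considered.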
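
One small simplification that may be worth considering: the function above collects attributes by casting `$element->attributes()` to an array and reading its `'@attributes'` key. Iterating over `attributes()` with `foreach` gives the same name/value pairs without depending on that cast. This is only a sketch; `attributes_to_array()` is a hypothetical helper name, and it only sees attributes that are not in a namespace.

```php
<?php
// Hypothetical helper: collect an element's (non-namespaced) attributes as name => string value.
function attributes_to_array(SimpleXMLElement $element)
{
    $result = array();
    foreach ($element->attributes() as $name => $value) {
        $result[$name] = (string) $value; // cast the SimpleXMLElement value to a plain string
    }
    return $result;
}

$li   = simplexml_load_string('<li name="gender" value="female"> attractive.</li>');
$bare = simplexml_load_string('<condition><li> unknown.</li></condition>');

print_r(attributes_to_array($li));               // name => gender, value => female
echo count(attributes_to_array($bare)), PHP_EOL; // 0 for a bare CONDITION tag
```

The resulting array can be passed to `count()` and `array_key_exists()` exactly as `$attributesArray` is in the function above.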


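Please note that I'm not looking for someone to "do my homework for me", but rather for a (not so) gentle nudge in the right direction. That said, code examples are still welcome, just not required. :) – 2013-02-22 01:43:08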

Answer


Please note that I'm not using SimpleXML because, IMHO, DOMDocument is simply better and waaay more powerful. Both DOMDocument and DOMXPath have been available since PHP 5.

I created a simple parser class that parses the provided document and extracts the conditions in their different styles:

class AIMLParser 
{ 
    public function parse($data) 
    { 
     $internalErrors = libxml_use_internal_errors(true); 

     $dom = new DOMDocument(); 
     $dom->loadHTML($data); 
     $xpath = new DOMXPath($dom); 

     $templates = array(); 

     foreach($xpath->query('//template') as $templateNode) { 
      $template = array(
       'text' => $templateNode->firstChild->nodeValue, // note: this expects the first child node to always be the text node 
       'conditions' => array(), 
      ); 

      foreach ($templateNode->getElementsByTagName('condition') as $condition) { 
       if ($condition->hasAttribute('name') && $condition->hasAttribute('value')) { 
        $template['conditions'] = $this->parseConditionsWithoutChildren($template['conditions'], $condition); 
       } elseif ($condition->hasAttribute('name')) { 
        $template['conditions'] = $this->parseConditionsWithNameAttribute($template['conditions'], $condition); 
       } else { 
        $template['conditions'] = $this->parseConditionsWithoutAttributes($template['conditions'], $condition); 
       } 
      } 

      $templates[] = $template; 
     } 

     libxml_use_internal_errors($internalErrors); 

     return $templates; 
    } 

    private function parseConditionsWithoutChildren(array $conditions, DOMNode $condition) 
    { 
     if (!array_key_exists($condition->getAttribute('name'), $conditions)) { 
      $conditions[$condition->getAttribute('name')] = array(); 
     } 

     $conditions[$condition->getAttribute('name')][$condition->getAttribute('value')] = $condition->nodeValue; 

     return $conditions; 
    } 

    private function parseConditionsWithNameAttribute(array $conditions, DOMNode $condition) 
    { 
     if (!array_key_exists($condition->getAttribute('name'), $conditions)) { 
      $conditions[$condition->getAttribute('name')] = array(); 
     } 

     foreach ($condition->getElementsByTagName('li') as $listItem) { 
      $conditions[$condition->getAttribute('name')][$listItem->getAttribute('value')] = $listItem->nodeValue; 
     } 

     return $conditions; 
    } 

    private function parseConditionsWithoutAttributes(array $conditions, DOMNode $condition) 
    { 
     foreach ($condition->getElementsByTagName('li') as $listItem) { 
      if (!array_key_exists($listItem->getAttribute('name'), $conditions)) { 
       $conditions[$listItem->getAttribute('name')] = array(); 
      } 

      $conditions[$listItem->getAttribute('name')][$listItem->getAttribute('value')] = $listItem->nodeValue; 
     } 

     return $conditions; 
    } 
} 

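What it does is search the document for template nodes and loop through them. For each template node it works out which style of condition is used, and based on that it picks the matching parsing function. After looping through all templates it returns a parsed array containing all the information you need (I think).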

To parse a document, you could do:

$parser = new AIMLParser(); 
$templates = $parser->parse($someVariableWithTheContentOfTheDocument); 

Demo: http://codepad.viper-7.com/JPuBaE
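
As a rough illustration of how the parsed array might be used, the sketch below picks a response for one template. It assumes the array shape the class above builds (`'text'` plus `'conditions'` keyed by property name, then by value); `pick_response()` and the `$clientProperties` array are hypothetical names, and the sample `$template` is hand-written rather than actual parser output.

```php
<?php
// Hypothetical helper: choose the response text for one parsed template.
function pick_response(array $template, array $clientProperties)
{
    foreach ($template['conditions'] as $name => $choices) {
        if (!isset($clientProperties[$name])) {
            continue; // this property is unknown for the current client
        }
        $value = $clientProperties[$name];
        if (isset($choices[$value])) {
            // Join the leading text and the matching choice with a single space.
            return rtrim($template['text']) . ' ' . ltrim($choices[$value]);
        }
    }
    return rtrim($template['text']); // no condition matched
}

// Hand-written stand-in for one entry of the array returned by parse().
$template = array(
    'text' => 'You sound very ',
    'conditions' => array(
        'gender' => array('female' => ' attractive.', 'male' => ' handsome.'),
    ),
);

echo pick_response($template, array('gender' => 'male')), PHP_EOL; // prints "You sound very handsome."
```

Because all three condition styles are normalised into the same nested array, a lookup like this does not need to know which form the AIML used.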


ORLY? Downvote? – PeeHaa 2013-02-23 16:25:15


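While this function doesn't fit in with the rest of the script, it **does** give me a direction to investigate, so +1. If it does indeed lead me to what I need, I'll be sure to mark it as the "answer". Thanks. :) – 2013-02-23 16:27:09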


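While I didn't end up using any of the example code posted above, it was enough of a "nudge in the right direction" to qualify as the answer. I'm still fleshing out the code, but when I'm happy with it I'll post it here so that others can benefit. Thanks again, @PeeHaa. – 2013-02-24 06:24:11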