2014-01-08 71 views
3

在我們的應用中,我們允許用戶編寫特定的條件,我們允許他們使用這樣的符號表示的條件:從綴轉換條件公式前綴符號

(1 and 2 and 3 or 4) 

如果每個數字編號對應一個具體的規則/條件。現在的問題是,我應該怎麼把它轉換,使得最後的結果是這樣的:

{ 
    "$or": [ 
     "$and": [1, 2, 3], 
     4 
    ] 
} 

再舉一個例子:

(1 or 2 or 3 and 4) 

要:

{ 
    "$or": [ 
     1, 
     2, 
     "$and": [3, 4] 
    ] 
} 



我h AVE寫50在標記生成器的線,成功地符號化語句爲標記和使用堆棧/ PEEK算法驗證,令牌看起來是這樣的:

["(", "1", "and", "2", "and", "3", "or", "4", ")"] 

現在我應該如何轉換這種「中間符號」的成「前綴記法」與and優先於or的規則?

部分指針或關鍵字非常感謝!現在我所擁有的並不是真正將我引向目前我需要的東西。

一些研究至今:

編輯

此外,用戶可以指定任意數量的括號的能力,如果他們堅持,比如像:

((1 or 3) and (2 or 4) or 5) 

所以得到翻譯爲:

{ 
    "$or": [{ 
     $and": [ 
      "$or": [1, 3], 
      "$or": [2, 4] 
     }, 
     5 
    ] 
} 

編輯2

我想出了算法。 Posted as an answer below。感謝您的幫助!

+1

http://blog.reverberate.org/2013/07/ll-and-lr-parsing -demystified.html – zerkms

+0

謝謝@zerkms!我肯定在研究中需要這個:) –

+0

PS:我認爲你的第二種情況下的AST缺少第二個'$或'標記。它應該是這樣的:http://sketchia.com/draw_G2HXgRv.html PPS:瘋狂的繪畫技巧,我知道) – zerkms

回答

0

感謝指導傢伙們,至少我拿出了我自己的解決方案。由於這是我第一次做數學方程解析,請原諒我,如果我這樣做是錯誤的或無效的,或者幫我找到這個錯誤:

基本上,這裏是我做的步驟,這一點:

  1. 之前解析,總是驗證模式。出現錯誤時拋出錯誤。
  2. 一旦經過驗證,我們會爲前綴符號轉換做一箇中綴表示法。此步驟要求「和」優先於「或」。
    1. 反轉給定的模式
    2. 是否將中綴轉換爲後綴表示法轉換。I dumb, I learn from this
    3. 做反向再次
    4. 中綴到前綴應該在這個階段
  3. 構建從前綴符號,使得
    1. 節點總是有一棵樹來完成,並且最大,有兩個分支機構
    2. 遍歷,直到它達到全葉
  4. 優化樹,使得合併同類運營商一起(如多發性$and運營子$and可以合併形成一個較短的樹)
  5. 混合與設定的給定的標準,並且全部完成!

工作實例可以在這裏找到:http://jsfiddle.net/chaoszcat/uGKYj/3/

工作如下代碼:

(function() { 

    /** 
    * This is a source example of my original question on 
    * http://stackoverflow.com/questions/20986255/converting-conditional-equation-from-infix-to-prefix-notation 
    * 
    * This is my solution and use it at your own risk 
    * @author Lionel Chan <chaoszcat[at]gmail.com> 
    */ 

    /** 
    * isNumeric, from jQuery. Duplicated here to make this js code pure 
    * @param {mix} n Test subject 
    * @returns {boolean} true if it's numeric 
    */ 
    function isNumeric(n) { 
     return !isNaN(parseFloat(n))&&isFinite(n); 
    } 

    /** 
    * Node class - represent a operator or numeric node 
    * @param {string} token The token string, operator "and", "or", or numeric value 
    */ 
    function Node(token) { 
     this.parent = null; 
     this.children = []; //one node has two children at most 
     this.token = token; 
     this.is_operator = token === 'and' || token === 'or'; 
     this.is_numeric = !this.is_operator; 
     this.destroyed = false; 
    } 

    Node.prototype = { 

     isOperator: function() { return this.is_operator;}, 
     isNumeric: function() { return this.is_numeric;}, 

     //While building tree, a node is full if there are two children 
     isFull: function() { 
      return this.children.length >= 2; 
     }, 

     addChild: function(node) { 
      node.parent = this; 
      this.children.push(node); 
     }, 

     hasParent: function() { 
      return this.parent !== null; 
     }, 

     indexOfChild: function(node) { 
      for (var i = 0 ; i < this.children.length ; ++i) { 
       if (this.children[i] === node) { 
        return i; 
       } 
      } 
      return -1; 
     }, 

     removeChild: function(node) { 
      var idx = this.indexOfChild(node); 
      if (idx >= 0) { 
       this.children[idx].parent = null; //remove parent relationship 
       this.children.splice(idx, 1); //splice it out 
      } 
     }, 

     /** 
     * Pass my children to the target node, and destroy myself 
     * 
     * @param {Node} node A target node 
     */ 
     passChildrenTo: function(node) { 
      for (var i = 0 ; i < this.children.length ; ++i) { 
       node.addChild(this.children[i]); 
      } 
      this.destroy(); 
     }, 

     //Destroy this node 
     destroy: function() { 
      this.parent.removeChild(this); 
      this.children = null; 
      this.destroyed = true; 
     } 
    }; 

    /** 
    * Tree class - node manipulation 
    * @param {array} prefixTokens The converted, prefix-notated tokens 
    */ 
    function Tree(prefixTokens) { 
     this.buildTree(prefixTokens); 
     //Optimize tree - so that the tree will merge multiple similar operators together 
     this.optimize(this.root); 
    } 

    Tree.prototype = { 
     root: null, 

     //Reference to the deepest operator node in the tree for next attachment point 
     deepestNode: null, 

     /** 
     * Render this tree with given criteria array 
     * @param {array} crits 
     * @returns {object} The built criteria 
     */ 
     render: function(crits) { 
      //After optimization, we build the criteria and that's all! 
      return this.buildCriteria(this.root, crits); 
     }, 

     /** 
     * Build criteria from root node. Recursive 
     * 
     * @param {Node} node 
     * @param {array} crits 
     * @returns {object} of criteria 
     */ 
     buildCriteria: function(node, crits) { 

      var output = {}, 
       label = '$'+node.token; 

      output[label] = []; //cpnditions array 

      for (var i = 0 ; i < node.children.length ; ++i) { 
       if (node.children[i].isOperator()) { 
        output[label].push(this.buildCriteria(node.children[i], crits)); 
       }else{ 
        output[label].push(crits[node.children[i].token-1]); 
       } 
      } 
      return output; 
     }, 

     /** 
     * Optimize the tree, we can simplify nodes with same operator. Recursive 
     * 
     * @param {Node} node 
     * @void 
     */ 
     optimize: function(node) { 

      //note that node.children.length will keep changing since the swapping children will occur midway. Rescan is required 
      for (var i = 0 ; i < node.children.length ; ++i) { 
       if (node.children[i].isOperator()) { 
        this.optimize(node.children[i]); 
        if (node.children[i].token === node.token) { 
         node.children[i].passChildrenTo(node); 
         i = 0; //rescan this level whenever a swap occured 
        } 
       } 
      } 
     }, 

     /** 
     * Build tree from raw tokens 
     * @param {array} tokens 
     */ 
     buildTree: function(tokens) { 
      for (var i = 0 ; i < tokens.length ; ++i) { 
       this.addNode(new Node(tokens[i])); 
      } 
     }, 

     /** 
     * Add node into tree 
     * 
     * @param {Node} node 
     */ 
     addNode: function(node) { 

      //If no root? The first node is root 
      if (this.root === null) { 
       this.root = node; 
       this.deepestNode = node; 
       return; 
      } 

      //if deepestNode is full, traverse up until we find a node with capacity 
      while(this.deepestNode && this.deepestNode.isFull()) { 
       this.deepestNode = this.deepestNode.parent; 
      } 

      if (this.deepestNode) { 
       this.deepestNode.addChild(node); 
      } 

      //If the current node is an operator, we move the deepestNode cursor to it 
      if (node.isOperator()) { 
       this.deepestNode = node; 
      } 
     } 
    }; 

    /** 
    * Main criteria parser 
    */ 
    var CriteriaParser = { 

     /** 
     * Convert raw string of pattern (1 and 2 or 3) into the object of criteria pattern 
     * 
     * @param {string} str The raw pattern 
     * @param {array} crits The raw list of criteria 
     * @returns {String|Boolean} 
     */ 
     parse: function(str, crits) { 
      var tokens = this.tokenize(str), 
       validationResult = this.validate(tokens, crits), 
       prefixNotation = ''; 

      //Once succeded, we proceed to convert it to prefix notation 
      if (validationResult === true) { 
       prefixNotation = this.infixToPrefix(tokens); 
       return (new Tree(prefixNotation)).render(crits); 
      }else{ 
       return validationResult; 
      } 
     }, 

     /** 
     * Convert the infix notation of the pattern (1 and 2 or 3) into prefix notation "or and 1 2 3" 
     * 
     * Note: 
     * - and has higher precedence than or 
     * 
     * Steps: 
     * 1. Reverse the tokens array 
     * 2. Do infix -> postfix conversion (http://www.cs.arizona.edu/classes/cs227/spring12/infix.pdf, http://scriptasylum.com/tutorials/infix_postfix/algorithms/infix-postfix/index.htm) 
     * 3. Reverse the result 
     * 
     * @param {array} tokens The tokenized tokens 
     * @returns {array} prefix notation of pattern 
     */ 
     infixToPrefix: function(tokens) { 

      var reversedTokens = tokens.slice(0).reverse(), //slice to clone, so not to disturb the original array 
       stack = [], 
       output = []; 

      //And since it's reversed, please regard "(" as closing bracket, and ")" as opening bracket 
      do { 
       var stackTop = stack.length > 0 ? stack[stack.length-1] : null, 
        token = reversedTokens.shift(); 

       if (token === 'and') { 
        while(stackTop === 'and') { 
         output.push(stack.pop()); 
         stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
        } 
        stack.push(token); 
        stackTop = token; 
       }else if (token === 'or') { 
        while(stackTop === 'and' || stackTop === 'or') { //and has higher precedence, so it will be popped out 
         output.push(stack.pop()); 
         stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
        } 
        stack.push(token); 
        stackTop = token; 
       }else if (token === '(') { //'(' is closing bracket in reversed tokens 
        while(stackTop !== ')' && stackTop !== undefined) { //keep looping until found a "open -)" bracket 
         output.push(stack.pop()); 
         stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
        } 
        stack.pop(); //remove the open ")" bracket 
        stackTop = stack.length > 0 ? stack[stack.length-1] : null; 
       }else if (token === ')') { //')' is opening bracket in reversed tokens 
        stack.push(token); 
       }else if (isNumeric(token)) { 
        output.push(token); 
       }else if (token === undefined) { 
        // no more tokens. Just shift everything out from stack 
        while(stack.length) { 
         stackTop = stack.pop(); 

         if (stackTop !== undefined && stackTop !== ')') { 
          output.push(stackTop); 
         } 
        } 
       } 

      }while(stack.length || reversedTokens.length); 

      //Reverse output and we are done 
      return output.reverse(); 
     }, 

     /** 
     * Tokenized the provided pattern 
     * @param {string} str The raw pattern from user 
     * @returns {array} A tokenized array 
     */ 
     tokenize: function(str) { 
      var pattern = str.replace(/\s/g, ''), //remove all the spaces :) not needed 
       tokens = pattern.split(''), 
       tokenized = []; 

      //Tokenize it and verify 
      var token = null, 
       next = null; 

      //attempts to concatenate the "and" and "or" and numerics 
      while (tokens.length > 0) { 
       token = tokens.shift(); 
       next = tokens.length > 0 ? tokens[0] : null; 

       if (token === '(' || token === ')') { 
        tokenized.push(token); 
       }else if (token === 'a' && tokens.length >= 2 && tokens[0] === 'n' && tokens[1] === 'd') { //and 
        tokenized.push(token + tokens.shift() + tokens.shift()); 
       }else if (token === 'o' && tokens.length >= 1 && next === 'r') { //or 
        tokenized.push(token + tokens.shift()); 
       }else if (isNumeric(token)) { 
        while(isNumeric(next)) { 
         token += next; 
         tokens.shift(); //exhaust it 
         next = tokens.length > 0 ? tokens[0] : null; 
        } 
        tokenized.push(token); 
       }else{ 
        tokenized.push(token); 
       } 
      } 

      return tokenized; 
     }, 

     /** 
     * Attempt to validate tokenized tokens 
     * 
     * @param {array} tokens The tokenized tokens 
     * @param {array} crits The user provided criteria set 
     * @returns {Boolean|String} Returns boolean true if succeeded, string if error occured 
     */ 
     validate: function(tokens, crits) { 

      var valid = true, 
       token = null, 
       stack = [], 
       nextToken = null, 
       criteria_count = crits.length; 

      for (var i = 0 ; i < tokens.length ; ++i) { 

       token = tokens[i]; 
       nextToken = i < tokens.length - 1 ? tokens[i+1] : null; 

       if (token === '(') { 
        stack.push('('); 
        if (!isNumeric(nextToken) && nextToken !== '(' && nextToken !== ')') { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else if (token === ')') { 
        if (stack.length > 0) { 
         stack.pop(); 
        }else{ 
         throw 'Unexpected closing bracket'; 
        } 
        if (nextToken !== ')' && nextToken !== 'and' && nextToken !== 'or' && nextToken !== null) { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else if (token === 'and' || token === 'or') { 
        if (!isNumeric(nextToken) && nextToken !== '(') { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else if (isNumeric(token) && token <= criteria_count) { 
        if (nextToken !== ')' && nextToken !== 'and' && nextToken !== 'or') { 
         throw 'Unexpected token "'+nextToken+'"'; 
        } 
       }else{ 
        //anything not recognized, die. 
        throw 'Unexpected token "'+token+'"'; 
       } 
      } 

      //Last step - check if we have all brackets closed 
      if (valid && stack.length > 0) { 
       throw 'Missing '+stack.length+' closing bracket'; 
      } 

      return valid; 
     } 
    }; 

    //This is an example pattern and criteria set. Note that pattern numbers must match criteria numbers. 
    var pattern = '((1 or 3) and (2 or 4) or 5)', 
     crits = [ 
      1, 2, 3, 4, 5 
     ]; 

    //lazy on the document on load. Just delay 
    setTimeout(function() { 

     var result; 
     try { 
      result = JSON.stringify(CriteriaParser.parse(pattern, crits), undefined, 4); 
     }catch(e) { 
      result = e; 
     } 

     var pre = document.createElement('pre'); 
     pre.innerHTML = result; 
     document.body.appendChild(pre); 
    }, 10); 

})(); 
1

這是最容易使用兩步過程完成的。 1)轉換爲語法樹。 2)將語法樹轉換爲前綴表示法。

語法樹基本上與您的前綴表示法相同,只是使用您的編程語言的數據結構構建而成。

創建語法樹的標準方法是使用LALR分析器生成器,該生成器可用於大多數語言。 LALR解析器速度快,功能強大,表現力強。 LALR解析器生成器將.y文件作爲輸入,並以您選擇的編程語言輸出解析器的源代碼文件。所以你運行一次LALR解析器生成器來生成解析器。

(所有程序員都應該學會使用解析器生成器:)。使用標準標記器也很明智,但我猜測你已經編寫了自己的:))

以下是爲您的迷你語言生成LALR解析器的.y文件。通過LALR解析器生成器運行該.y文件將輸出LALR解析器的源代碼,該解析器將令牌作爲輸入並輸出分析樹(在變量$ root_tree中)。您需要在別處手動定義parsetree_binaryop數據結構。

%left AND. 
%left OR. 
start ::= expr(e). { $root_tree = e; } 
expr(r) ::= expr(e1) AND expr(e2). { r = new parsetree_binaryop(e1, OP_AND, e2); } 
expr(r) ::= expr(e1) OR expr(e2). { r = new parsetree_binaryop(e1, OP_OR, e2); } 
expr(r) ::= LPAR expr(e) RPAR. { r = e; } 

的「%左,」意思是,是左結合(我們可以選擇合適的也不要緊,AND和OR)。在「%left OR」之前提到「%AND AND」意味着AND綁定比OR更緊密,因此生成的解析器會做正確的事情。

當語法樹解析器給你時,生成文本表示很容易。

編輯:這似乎是一個LALR解析器生成器,其在JavaScript輸出解析器:http://sourceforge.net/projects/jscc/

+0

「這也是聰明的使用標準分詞器,而我猜你已經編寫了自己」 ---事情是,js的認真分析區域吸收。當我在尋找時 - 我沒有找到任何像樣的東西。 – zerkms

+0

那麼必須編寫我自己的解析器嗎? :(讓我檢查.. :) –

+0

你的意思是「必須編寫我自己的解析器生成器」?可能你也沒有,http://sourceforge.net/projects/jscc/看起來很有前途。 – Thue

0

首先定義的語義。在你的第一個例子,你給(1 and 2 and 3) or 4解釋,但它也可以是1 and 2 and (3 or 4)這樣:

{ 
    "$and": [ 
     {"$or": [3,4] }, 
     [1,2] 
    ] 
} 

讓我們假設and具有更高的優先級。然後只需通過列表加入與and的所有條款。接下來,所有其他與or加入。

+0

謝謝,但我在這裏不評價方程我已經有現成的代碼在服務器端做評價。我需要的是一個轉換,我想我不知何故得到infix->前綴的想法。好像前綴可以解決問題。一旦完成,我會在這裏發佈解決方案。 –

+0

感謝您的努力。我已經想出瞭解決方案! :)乾杯 –