2017-02-15

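I have HTML pages on my product website, and I want to parse the document and get the product versions from the HTML page (parse and grep HTML in Ansible).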

The HTML page looks like this:

<html> 
....... 
....... 
<body> 
....... 
....... 
<div id='version_info'> 
    <div class="product-version"> 
     <div class="product-title">Name of the product 1:</div><div class="product-value">ver_123</div> 
    </div> 
    <div class="product-version"> 
     <div class="product-title">Name of the product 2:</div><div class="product-value">ver_456</div> 
    </div> 
    <div class="product-version"> 
     <div class="product-title">Name of the product 3:</div><div class="product-value">ver_845</div> 
    </div> 
    <div class="product-version"> 
     <div class="product-title">Name of the product 4:</div><div class="product-value">ver_146</div> 
    </div> 
</div> 
....... 
....... 
</body> 
....... 
....... 
</html> 

How can I grep the document and build a string like this? productname1 = ver_123, productname2 = ver_456, productname3 = ver_845, and so on.

Do you need an answer for this specific form of HTML, or can it differ?

It would be great to get an answer for this HTML. But if you have a similar example, that would also help a lot.

Grepping XML/HTML: now you have two problems. – tedder42

Answer

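I worked with this particular HTML file. The playbook below collects the required values into a dictionary stored in the variable result.

Notes:

1. Please change the path to the HTML file in the playbook.

2. This particular playbook works for this HTML example. For further requirements and improvements, please provide the HTML.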

---
- hosts: localhost
  name: "Getting variables from HTML"
  vars:
    result: {}
  tasks:
    # Read the HTML file; each line ends up in search.stdout_lines
    - name: "Getting content of the file"
      command: cat /path/to/html/file
      register: search

    # For every line containing a product-title div, strip the markup to
    # get the title (key) and the version (value), then add them to result
    - name: "Creating dictionary while looping over file"
      ignore_errors: true
      vars:
        key: "{{item | replace('<div class=\"product-title\">','') | replace('</div>','') | regex_replace('<div.*','') | regex_replace('^\\s*','')}}"
        value: "{{item | replace('<div class=\"product-title\">','') | replace('</div>','') | regex_replace('^[\\w\\s\\:]*','') | replace('<div class=\"product-value\">','') | regex_replace('\\s*$','')}}"
      set_fact:
        result: "{{ result | combine({ key: value }) }}"
      when: "'product-title' in item"
      with_items: "{{search.stdout_lines}}"

    # Print the collected dictionary
    - name: "Getting register"
      debug:
        msg: "{{result}}"
...

Output

ok: [localhost] => { 
    "msg": { 
     "Name of the product 1:": "ver_123", 
     "Name of the product 2:": "ver_456", 
     "Name of the product 3:": "ver_845", 
     "Name of the product 4:": "ver_146" 
    } 
} 
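
If the line-by-line regex approach turns out to be too brittle (for example, when a title and its value end up on separate lines), the same extraction can be sketched outside Ansible with Python's standard-library html.parser. This is only a minimal sketch under assumptions: the class and function names (ProductVersionParser, parse_versions, format_versions) are made up for illustration, the sample markup is a shortened copy of the HTML in the question, and the trailing colon is stripped from each title, which the playbook above does not do.

```python
from html.parser import HTMLParser


class ProductVersionParser(HTMLParser):
    """Hypothetical helper: collects title -> version pairs from the divs."""

    def __init__(self):
        super().__init__()
        self.versions = {}   # title -> version, in document order
        self._field = None   # "title", "value", or None
        self._title = None   # most recent title that has no value yet

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        css_class = dict(attrs).get("class")
        if css_class == "product-title":
            self._field = "title"
        elif css_class == "product-value":
            self._field = "value"

    def handle_endtag(self, tag):
        if tag == "div":
            self._field = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._field == "title":
            # Assumption: drop the trailing colon to match the requested output
            self._title = text.rstrip(":")
        elif self._field == "value" and self._title is not None:
            self.versions[self._title] = text
            self._title = None


def parse_versions(html):
    """Return a dict of product title -> version found in the HTML text."""
    parser = ProductVersionParser()
    parser.feed(html)
    parser.close()
    return parser.versions


def format_versions(versions):
    """Join the pairs as 'name = version, name = version, ...'."""
    return ", ".join(f"{name} = {ver}" for name, ver in versions.items())


# Shortened copy of the markup from the question
sample = """
<div id='version_info'>
    <div class="product-version">
     <div class="product-title">Name of the product 1:</div><div class="product-value">ver_123</div>
    </div>
    <div class="product-version">
     <div class="product-title">Name of the product 2:</div><div class="product-value">ver_456</div>
    </div>
</div>
"""

versions = parse_versions(sample)
print(format_versions(versions))
```

Because the parser tracks tags rather than lines, it does not depend on the title and value divs sharing a line. Such a script could be called from a playbook and its output registered, but that wiring is not shown here.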

Thanks. I'll check your code today and let you know :)

You're welcome :)

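@SRNathan I see it has been almost 7 months since this post, but it looks like Sahil solved your problem; you should accept the answer. If you solved the problem yourself and his answer helped, consider accepting it and adding your solution to the OP as an edit.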