2016-10-21
-4

I am looking for sample code that retrieves clean output from the HTML table code below (Python HTML table data parsing).

<td width="40%" valign="top" colSpan="1" style="padding-left:5px;padding-top:4px"><input type="text" id="subscriberDeletionForm_OSILA_DisplayName" name="OSILA_DisplayName" class="roTextField" style="width:300px;" value="Rashmi HK" readonly="readonly" tabindex="-1"/> 
</td> 
<td colspan="1">&nbsp;</td> 
</tr> 
<tr> 
<td class="firstLabelInRowCell"><label id="subscriberDeletionForm_OSILA_CountryCode_label" for="subscriberDeletionForm_OSILA_CountryCode" class="rwLabel">Country Code:</label> 
</td> 
<td width="40%" valign="top" colSpan="1" style="padding-left:5px;padding-top:4px"><input type="text" id="subscriberDeletionForm_OSILA_CountryCode" name="OSILA_CountryCode" class="roTextField" style="width:300px;" value="91" readonly="readonly" tabindex="-1"/> 
</td> 
<td class="labelCell"><label id="subscriberDeletionForm_OSILA_AreaCode_label" for="subscriberDeletionForm_OSILA_AreaCode" class="rwLabel">Area Code:</label> 
</td> 
<td width="40%" valign="top" colSpan="1" style="padding-left:5px;padding-top:4px"><input type="text" id="subscriberDeletionForm_OSILA_AreaCode" name="OSILA_AreaCode" class="roTextField" style="width:300px;" value="80" readonly="readonly" tabindex="-1"/> 

The output I need:

OSILA_DisplayName = Rashmi HK 
OSILA_CountryCode = 91                    OSILA_AreaCode  = 80 

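I am using the code below and it does retrieve the values. However, I need to extract a large number of fields in the same way, so I am looking for a different approach.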

# Marker that identifies the line holding the display-name field.
OSILA_DisplayName = 'id="subscriberDeletionForm_OSILA_DisplayName"'
with open('delsubinfo1', 'r') as f22:
    for line2 in f22:
        if OSILA_DisplayName in line2:
            # Splitting on '"' puts the value attribute at index 19.
            line2 = line2.split('"')
            OSILA_DisplayName1 = line2[19].strip()
            print(OSILA_DisplayName1)

# Same approach for the country code, matched on the name attribute.
OSILA_CountryCode = 'name="OSILA_CountryCode"'
with open('delsubinfo1', 'r') as f23:
    for line3 in f23:
        if OSILA_CountryCode in line3:
            line3 = line3.split('"')
            OSILA_CountryCode1 = line3[19].strip()
            print(OSILA_CountryCode1)
+1

Where is your code? –

+1

What have you tried? –

+0

Updated the code. – mikegray

Answer

0

I suggest using BeautifulSoup to parse the HTML text.

You can retrieve the data from the HTML code like this.

from bs4 import BeautifulSoup

# Read the saved HTML and parse it with Python's built-in parser.
with open('delsubinfo1', 'r') as f:
    txt = f.read()
soup = BeautifulSoup(txt, "html.parser")
for input_tag in soup.find_all("input"):
    if input_tag.get("name") in ('OSILA_DisplayName', 'OSILA_CountryCode'):
        print(input_tag.get("name").ljust(18), '=', input_tag.get("value"))
# OSILA_DisplayName  = Rashmi HK
# OSILA_CountryCode  = 91
+0

uehara, can we get the name and value as variables for further processing? – mikegray

+0

Yes, you can. You can find more information here (https://www.crummy.com/software/BeautifulSoup/bs4/doc/). – uehara1414
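
To illustrate keeping the names and values as variables, here is a minimal sketch that collects every input field into a dictionary. It is not the code from the answer above: to stay free of third-party dependencies it swaps BeautifulSoup for the standard library's html.parser module, and the InputCollector class and the inline sample_html string (a shortened stand-in for the contents of 'delsubinfo1') are assumptions made for the example.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Hypothetical helper: collects name/value pairs of every <input> tag."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (attribute, value) pairs.
        # Self-closing tags such as <input ... /> are routed here as well.
        if tag == "input":
            attr = dict(attrs)
            if attr.get("name"):
                self.fields[attr["name"]] = attr.get("value") or ""


# Assumed sample data standing in for the saved HTML file.
sample_html = '''
<td><input type="text" name="OSILA_DisplayName" value="Rashmi HK" readonly="readonly"/></td>
<td><input type="text" name="OSILA_CountryCode" value="91" readonly="readonly"/></td>
<td><input type="text" name="OSILA_AreaCode" value="80" readonly="readonly"/></td>
'''

collector = InputCollector()
collector.feed(sample_html)
fields = collector.fields

# Each field is now an ordinary dictionary entry that later code can use.
country_code = fields["OSILA_CountryCode"]
for name, value in fields.items():
    print(name, "=", value)
```

Because nothing is filtered by name, new fields in the form show up in the dictionary without changing the code. With BeautifulSoup, the same dictionary could presumably be built by looping over soup.find_all("input") and reading each tag's name and value with .get().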