2014-09-26 138 views
Latitude :23.1100348 
Longitude:72.5364922 
date&time :30:August:2014 05:04:31 PM 
gsm cell id: 4993 
Neighboring List- Lac : Cid : RSSI 
15000  : 7072  : 25 dBm 
15000  : 7073  : 23 dBm 
15000  : 6102  : 24 dBm 
15000  : 6101  : 24 dBm 
15000  : 6103  : 17 dBm 

Latitude :23.1120549 
Longitude:72.5397988 
date&time :30:August:2014 05:04:34 PM 
gsm cell id: 4993 
Neighboring List- Lac : Cid : RSSI 
15000  : 7072  : 24 dBm 
15000  : 7073  : 22 dBm 
15000  : 6102  : 23 dBm 
15000  : 6101  : 23 dBm 
15000  : 2552  : 16 dBm 

This is my My.txt file. I want to convert it into an XML file like the one below. How can I convert a .txt file to an XML file using Python?

<celldata> 
<time>  </time> 
<latitude> </latitude> 
<longitude> </longitude> 

</celldata> 

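I tried to make a list of all the components, but I am not getting any output. I want to store all the values of latitude, longitude, gsm cell id, and time in lists, which will then be added to the XML file. I wrote the code below.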

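import re

pa = 'Longitude|Latitude|gsm cell id|Neighboring List- Lac : Cid : RSSI'

with open('cell.txt') as file:  # read-only text mode (the default)
    for line in file:
        line = line.strip()  # strip() returns a new string, so keep the result
        if re.search(pa, line):
            # Note: 'date&time' is not in the pattern, so those lines are skipped.
            lineInfo = line.split(':')
            title = lineInfo[0]
            value = lineInfo[1]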
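Are 'time', 'latitude', and 'longitude' the only values you need in the XML file? If not, please provide the complete XML structure. – nu11p01n73R 2014-09-26 09:37:47

When you say "I tried to make a list of all the components, but I am not getting o/p": have you tried to output anything? You have only shown the reading code. – doctorlove 2014-09-26 09:44:22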

Answers

Try the following code as a starter:

#!python3 

import re 
import xml.etree.ElementTree as ET 

rex = re.compile(r'''(?P<title>Longitude 
         |Latitude 
         |date&time 
         |gsm\s+cell\s+id 
        ) 
        \s*:?\s* 
        (?P<value>.*) 
        ''', re.VERBOSE) 

root = ET.Element('root') 
root.text = '\n' # newline before the celldata element 

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    for line in f:
        # An empty line starts a new celldata element (a quick hack, ugly).
        if line.isspace():
            celldata = ET.SubElement(root, 'celldata')
            celldata.text = '\n'
            celldata.tail = '\n\n'

        # If the line contains the wanted data, process it.
        m = rex.search(line)
        if m:
            # Fix some problems with the title, as it will be used
            # as the tag name.
            title = m.group('title')
            title = title.replace('&', '')
            title = title.replace(' ', '')

            e = ET.SubElement(celldata, title.lower())
            e.text = m.group('value')
            e.tail = '\n'

# Display for debugging    
ET.dump(root) 

# Include the root element to the tree and write the tree 
# to the file. 
tree = ET.ElementTree(root) 
tree.write('cell.xml', encoding='utf-8', xml_declaration=True) 

For your example data, it displays:

<root> 
<celldata> 
<latitude>23.1100348</latitude> 
<longitude>72.5364922</longitude> 
<datetime>30:August:2014 05:04:31 PM</datetime> 
<gsmcellid>4993</gsmcellid> 
</celldata> 

<celldata> 
<latitude>23.1120549</latitude> 
<longitude>72.5397988</longitude> 
<datetime>30:August:2014 05:04:34 PM</datetime> 
<gsmcellid>4993</gsmcellid> 
</celldata> 

</root> 
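
The question also asked for the values collected into lists. As a rough sketch of one way to do that, the snippet below reads XML of the shape shown above and gathers each field into a list. The sample is inlined so the sketch runs on its own, and the helper name collect_values is hypothetical, not part of the answer's code.

```python
import xml.etree.ElementTree as ET

# Inlined sample with the same shape as the output shown above.
SAMPLE_XML = """<root>
<celldata>
<latitude>23.1100348</latitude>
<longitude>72.5364922</longitude>
<datetime>30:August:2014 05:04:31 PM</datetime>
<gsmcellid>4993</gsmcellid>
</celldata>

<celldata>
<latitude>23.1120549</latitude>
<longitude>72.5397988</longitude>
<datetime>30:August:2014 05:04:34 PM</datetime>
<gsmcellid>4993</gsmcellid>
</celldata>

</root>"""


def collect_values(root, tag):
    """Return the text of <tag> from every <celldata> record that has one."""
    return [cd.findtext(tag) for cd in root.iter('celldata')
            if cd.find(tag) is not None]


root = ET.fromstring(SAMPLE_XML)
latitudes = collect_values(root, 'latitude')
longitudes = collect_values(root, 'longitude')
times = collect_values(root, 'datetime')
cell_ids = collect_values(root, 'gsmcellid')

print(latitudes)
print(longitudes)
print(times)
print(cell_ids)
```

To read the file written by the script instead of the inlined sample, ET.parse('cell.xml').getroot() can replace the ET.fromstring call.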

Update for the wanted neighbour list:

#!python3 

import re 
import xml.etree.ElementTree as ET 

rex = re.compile(r'''(?P<title>Longitude 
         |Latitude 
         |date&time 
         |gsm\s+cell\s+id 
         |Neighboring\s+List-\s+Lac\s+:\s+Cid\s+:\s+RSSI 
        ) 
        \s*:?\s* 
        (?P<value>.*) 
        ''', re.VERBOSE) 

root = ET.Element('root') 
root.text = '\n' # newline before the celldata element 

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    for line in f:
        # An empty line starts a new celldata element (a quick hack, ugly).
        if line.isspace():
            celldata = ET.SubElement(root, 'celldata')
            celldata.text = '\n'
            celldata.tail = '\n\n'
        else:
            # If the line contains the wanted data, process it.
            m = rex.search(line)
            if m:
                # Fix some problems with the title, as it will be used
                # as the tag name.
                title = m.group('title')
                title = title.replace('&', '')
                title = title.replace(' ', '')

                if line.startswith('Neighboring'):
                    neighbours = ET.SubElement(celldata, 'neighbours')
                    neighbours.text = '\n'
                    neighbours.tail = '\n'
                else:
                    e = ET.SubElement(celldata, title.lower())
                    e.text = m.group('value')
                    e.tail = '\n'
            else:
                # This is a neighbour item. Split it by colon,
                # and set the attributes of the item element.
                item = ET.SubElement(neighbours, 'item')
                item.tail = '\n'

                lac, cid, rssi = (a.strip() for a in line.split(':'))
                item.attrib['lac'] = lac
                item.attrib['cid'] = cid
                item.attrib['rssi'] = rssi.split()[0]  # dBm removed

# Include the root element to the tree and write the tree 
# to the file. 
tree = ET.ElementTree(root) 
tree.write('cell.xml', encoding='utf-8', xml_declaration=True) 
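
The neighbour handling above splits each row on colons and drops the unit from the RSSI value. Isolated as a small sketch, that step might look like the following. The helper name parse_neighbour is hypothetical and only used here for illustration.

```python
import xml.etree.ElementTree as ET


def parse_neighbour(line):
    """Split one 'Lac : Cid : RSSI' row into a dict of attribute strings."""
    lac, cid, rssi = (part.strip() for part in line.split(':'))
    # rssi looks like '25 dBm'; keep only the number.
    return {'lac': lac, 'cid': cid, 'rssi': rssi.split()[0]}


# Build an <item> element whose attributes come from one sample row.
item = ET.Element('item', parse_neighbour('15000  : 7072  : 25 dBm \n'))
print(item.attrib)
```

Keeping the row parsing in one function makes it easier to test without a file, and a row with the wrong number of colons fails loudly with a ValueError from the tuple unpacking.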
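
Update for accepting the empty line before the neighbours (this version expects that empty line right after the Neighboring List header), and also a better implementation for general purposes: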

#!python3 

import re 
import xml.etree.ElementTree as ET 

rex = re.compile(r'''(?P<title>Longitude 
         |Latitude 
         |date&time 
         |gsm\s+cell\s+id 
         |Neighboring\s+List-\s+Lac\s+:\s+Cid\s+:\s+RSSI 
        ) 
        \s*:?\s* 
        (?P<value>.*) 
        ''', re.VERBOSE) 

root = ET.Element('root') 
root.text = '\n' # newline before the celldata element 

with open('cell.txt') as f:
    celldata = ET.SubElement(root, 'celldata')
    celldata.text = '\n'    # newline before the collected element
    celldata.tail = '\n\n'  # empty line after the celldata element
    status = 0              # initial status of the finite automaton
    for line in f:
        if status == 0:  # lines of the heading expected
            # If the line contains the wanted data, process it.
            m = rex.search(line)
            if m:
                # Fix some problems with the title, as it will be used
                # as the tag name.
                title = m.group('title')
                title = title.replace('&', '')
                title = title.replace(' ', '')

                if line.startswith('Neighboring'):
                    neighbours = ET.SubElement(celldata, 'neighbours')
                    neighbours.text = '\n'
                    neighbours.tail = '\n'
                    status = 1  # empty line, then the list of neighbours expected
                else:
                    e = ET.SubElement(celldata, title.lower())
                    e.text = m.group('value')
                    e.tail = '\n'
                    # keep the same status

        elif status == 1:  # empty line expected
            if line.isspace():
                status = 2  # list of neighbours must follow
            else:
                # The exception ends the loop, so no error status is needed.
                raise RuntimeError('Empty line expected. (status == {})'.format(status))

        elif status == 2:  # neighbour item, or the empty line as the final separator
            if line.isspace():
                celldata = ET.SubElement(root, 'celldata')
                celldata.text = '\n'
                celldata.tail = '\n\n'
                status = 0  # go to the initial status
            else:
                # This is a neighbour item. Split it by colon,
                # and set the attributes of the item element.
                item = ET.SubElement(neighbours, 'item')
                item.tail = '\n'

                lac, cid, rssi = (a.strip() for a in line.split(':'))
                item.attrib['lac'] = lac
                item.attrib['cid'] = cid
                item.attrib['rssi'] = rssi.split()[0]  # dBm removed
                # keep the same status

        else:
            # Should never happen; guards against a programming error.
            raise RuntimeError('Unexpected status {}.'.format(status))

# Display for debugging 
ET.dump(root) 

# Include the root element to the tree and write the tree 
# to the file. 
tree = ET.ElementTree(root) 
tree.write('cell.xml', encoding='utf-8', xml_declaration=True) 

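The code implements a so-called finite automaton, where the status variable represents its current state. You can visualize it with pencil and paper: draw small circles (called nodes in graph theory) with the status numbers inside. In a given state, only certain kinds of input (line) are allowed. When the input is recognized, draw an arrow (a directed edge in graph theory) to another state (possibly the same state, as a loop returning to the same node). Each arrow is annotated 'condition | action'.

The result may look complicated at first; however, it is easier in the sense that you can always focus on the part of the code that belongs to one particular status. The code can also be modified easily. Finite automata have limited power, but they are just perfect for this kind of problem.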
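
The pencil-and-paper picture can also be written down as a transition table. The sketch below is an illustration under assumptions, not the code from the answer: the names State, classify, TRANSITIONS, and trace are hypothetical, the arrows carry only the condition (no action), and the input lines are a small inlined sample.

```python
from enum import Enum


class State(Enum):
    HEADING = 0     # heading lines (Latitude, Longitude, ...) expected
    BLANK = 1       # empty line after the "Neighboring List" header expected
    NEIGHBOURS = 2  # neighbour rows, or the empty line that ends the record


def classify(line):
    """Reduce a raw line to the kind of input the automaton reacts to."""
    if not line or line.isspace():
        return 'blank'
    if line.startswith('Neighboring'):
        return 'list_header'
    return 'data'


# Each entry is one arrow: (current state, kind of input) -> next state.
# A pair that is missing from the table is an error.
TRANSITIONS = {
    (State.HEADING, 'data'): State.HEADING,
    (State.HEADING, 'blank'): State.HEADING,
    (State.HEADING, 'list_header'): State.BLANK,
    (State.BLANK, 'blank'): State.NEIGHBOURS,
    (State.NEIGHBOURS, 'data'): State.NEIGHBOURS,
    (State.NEIGHBOURS, 'blank'): State.HEADING,
}


def trace(lines, state=State.HEADING):
    """Return the list of states visited while reading the lines."""
    visited = [state]
    for line in lines:
        key = (state, classify(line))
        if key not in TRANSITIONS:
            raise ValueError(
                'Unexpected {} input in state {}'.format(key[1], state.name))
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


sample = [
    'Latitude :23.1100348\n',
    'gsm cell id: 4993\n',
    'Neighboring List- Lac : Cid : RSSI\n',
    '\n',
    '15000  : 7072  : 25 dBm\n',
    '\n',
]
print([s.name for s in trace(sample)])
```

With the arrows kept in one dictionary, adding a new state or a new kind of input means adding entries to the table rather than another elif branch.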

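I also want the neighbour list in this XML file, like cid=',,' rssi=',,,' – yogeshbhimani 2014-09-27 04:20:53

@yogeshbhimani: *"I want to learn this type of programming, please give me some links or anything."* The advice here is both simple and difficult. It depends on your age, your previous education, where you want to go, and how much effort you want to spend. – pepr 2014-09-27 13:47:27

I should warn you that I do not consider my code to be good. It was put together rather hastily and needs some cleanup (refactoring). In my opinion, Stack Overflow is perfect for answering exact questions; however, it is not suited to tutoring or to discussing problems that evolve over time. For that, I suggest another forum. I am personally active at Experts Exchange (http://www.experts-exchange.com/Programming/Languages/Scripting/Python/) under the same nickname. Let's discuss it there. – pepr 2014-09-27 13:55:32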
