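Python encoding error when converting a MySQL query to XML
I have a script that generates custom XML files, which must follow a specific format. It queries a database and turns the results into one large XML file. I run it against several databases, ranging from inventory parts lists to employee records.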
import csv
import glob
import logging
import os
import shutil
import string
import StringIO
import sys
import time
from datetime import datetime
from time import sleep

import MySQLdb
import lxml.builder
import lxml.etree
from lxml import etree
from lxml.builder import E as buildE


def logWrite(message):
    # Write a timestamped debug message to the sync log.
    # basicConfig only takes effect on the first call.
    logging.basicConfig(
        filename="C:\\logs\\XMLSyncOut.log",
        level=logging.DEBUG,
        format='%(asctime)s %(message)s',
        datefmt='%m/%d/%Y %I:%M:%S: %p'
    )
    logging.debug(message)


def buildTag(tag, parent=None, content=None):
    # Create an element, optionally set its text and attach it to a parent.
    element = buildE(tag)
    if content is not None:
        element.text = unicode(content)
    if parent is not None:
        parent.append(element)
    return element


def fetchXML(cursor):
    # Build <DATA><ROW><column>value</column>...</ROW>...</DATA> from the cursor.
    logWrite("constructing XML from cursor")
    fields = [x[0] for x in cursor.description]
    doc = buildTag('DATA')
    for record in cursor.fetchall():
        r = buildTag('ROW', parent=doc)
        for (k, v) in zip(fields, record):
            buildTag(k, content=v, parent=r)
    return doc


def updateDatabase1():
    try:
        conn = MySQLdb.connect(host='host', user='user', passwd='passwd', db='database')
        cursor = conn.cursor()
    except MySQLdb.Error:
        # Log before exiting; after sys.exit() nothing else would run.
        logWrite("Cannot connect to database - quitting!")
        sys.exit(1)
    cursor.execute("SELECT * FROM database.table")
    logWrite("Dumping fields from database.table into cursor")
    xmlFile = open("results.xml", "w")
    doc = fetchXML(cursor)
    xmlFile.write(etree.tostring(doc, pretty_print=True))
    logWrite("Writing XML results.xml")
    xmlFile.close()
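For some reason, one of the new databases, which I imported from an Excel spreadsheet, has some kind of encoding error that the others don't have. This is the error: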
element.text = unicode(content)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 21: ordinal not in range(128)
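For context on the traceback above: `unicode(content)` on a byte string decodes it with the default ASCII codec, and byte 0x96 is outside the ASCII range. If the spreadsheet text was stored as Windows-1252 (an assumption, though it is common for data exported from Excel on Windows), 0x96 is an en dash. The sketch below uses Python 3 syntax, unlike the Python 2 script above, and the sample bytes and the `try_decode` helper are made up for illustration:

```python
# Hypothetical sample value: 21 ASCII bytes followed by 0x96,
# mirroring "byte 0x96 in position 21" from the traceback.
raw = b"Widget, stainless 3/8" + b"\x96" + b"inch"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_ascii = try_decode(raw, "ascii")    # None: 0x96 is not an ASCII byte
as_utf8 = try_decode(raw, "utf-8")     # None: a lone 0x96 is not valid UTF-8
as_cp1252 = try_decode(raw, "cp1252")  # decodes; 0x96 becomes U+2013 (en dash)

print(repr(as_ascii), repr(as_utf8), repr(as_cp1252))
```

If the cp1252 result looks right for the real data, that suggests the column holds Windows-1252 bytes rather than UTF-8.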
I tried changing the buildTag function to explicitly encode to ASCII, like this:
def buildTag(tag, parent=None, content=None):
    element = buildE(tag)
    if content is not None:
        content = str(content).encode('ascii', 'ignore')
        element.text = content
    if parent is not None:
        parent.append(element)
    return element
That still didn't work.
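One likely reason the attempt above fails: in Python 2, calling `.encode()` on a byte string first decodes it implicitly as ASCII, so 0x96 triggers the same kind of error. The usual alternative is to decode the bytes into text using the encoding they were actually written in. Below is a minimal sketch in Python 3 syntax that uses the standard library's `xml.etree.ElementTree` in place of lxml so it runs without extra packages. The names `to_text` and `build_tag`, the sample values, and the `cp1252` default are assumptions for illustration, not part of the original script:

```python
import xml.etree.ElementTree as ET


def to_text(value, encoding="cp1252"):
    """Convert a database value to text: bytes are decoded, other types stringified."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


def build_tag(tag, parent=None, content=None, encoding="cp1252"):
    """Create an element, optionally set its text, and attach it to a parent."""
    element = ET.Element(tag)
    if content is not None:
        element.text = to_text(content, encoding)
    if parent is not None:
        parent.append(element)
    return element


# Hypothetical row: 0x92 is a right single quotation mark in cp1252.
row = build_tag("ROW")
build_tag("NAME", parent=row, content=b"O\x92Brien")
build_tag("QTY", parent=row, content=12)

# Serialize as UTF-8 so the non-ASCII character is written as itself, not dropped.
xml_bytes = ET.tostring(row, encoding="utf-8")
print(xml_bytes.decode("utf-8"))
```

Decoding keeps the character in the output instead of discarding it, which is what `'ignore'` would do even if the encode call succeeded.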
Any ideas on what I can do to stop this? I can't just escape the characters, because I can't have "\x92" showing up in the records as output.
You should set the encoding for the MySQL connection. Execute `SET NAMES 'UTF8'` (or whatever encoding suits you). See [the manual](http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html) for more information.
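The suggestion above could look roughly like the sketch below. The helper name `set_connection_charset` and the allow-list are hypothetical. The code only assumes a DB-API style connection whose cursors offer `execute()` and `close()`, which MySQLdb provides. The stand-in classes exist only so the sketch runs without a database:

```python
# Illustrative allow-list (an assumption, not exhaustive).
ALLOWED_CHARSETS = {"utf8", "utf8mb4", "latin1"}


def set_connection_charset(conn, charset="utf8"):
    """Issue SET NAMES on a DB-API connection so results arrive in a known encoding."""
    if charset not in ALLOWED_CHARSETS:
        raise ValueError("unsupported charset: %r" % charset)
    cursor = conn.cursor()
    try:
        # The name was checked against the allow-list, so formatting it in is safe.
        cursor.execute("SET NAMES '%s'" % charset)
    finally:
        cursor.close()


class FakeCursor:
    """Stand-in cursor that records executed statements instead of running them."""

    def __init__(self, log):
        self.log = log

    def execute(self, statement):
        self.log.append(statement)

    def close(self):
        pass


class FakeConnection:
    """Stand-in for a real connection such as the one MySQLdb.connect() returns."""

    def __init__(self):
        self.statements = []

    def cursor(self):
        return FakeCursor(self.statements)


demo_conn = FakeConnection()
set_connection_charset(demo_conn, "utf8")
print(demo_conn.statements)
```

In the script from the question, this would be called once right after `MySQLdb.connect(...)` and before the `SELECT`. MySQLdb's `connect()` also accepts `charset` and `use_unicode` keyword arguments, which may be a simpler way to get the same effect.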