2017-05-06 53 views

Hive ParseException in Drop Table statement

I am using Python with the pyodbc module to run Hive queries on Hadoop. The part of the code that triggers the problem looks like this:

import pyodbc 
import pandas 

oConnexionString = 'Driver={ClouderaHive};[...]' 
oConnexion = pyodbc.connect(oConnexionString, autocommit=True) 
oConnexion.setencoding(encoding='utf-8') 
oQueryParameter = "select * from my_db.my_table;" 
oParameterData = pandas.read_sql(oQueryParameter, oConnexion) 
oCursor = oConnexion.cursor() 

for oRow in oParameterData.index: 
    sTableName = oParameterData.loc[oRow,'TableName'] 
    oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';' 
    print(oQueryDeleteTable) 
    oCursor.execute(oQueryDeleteTable) 

The print statement outputs this: drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

The call to oCursor.execute then raises the following error:

pyodbc.Error: ('HY000', "[HY000] [Cloudera][HiveODBC] (80) Syntax or semantic analysis error thrown in server while execurint query. Error message from server: Error while compiling statement: FAILED: ParseException line 1:44 character ' (80) (SQLExecDirectW)")

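Note that when I copy the printed statement and run it manually in Hue, it works fine. I suspect the problem is related to the encoding of the variable sTableName, but I cannot figure out how to fix it.

Thanks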

Answer

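The query was failing because the variable sTableName was incorrectly encoded. Printing the variable on its own displayed the text correctly. With the print example above: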

>>> print(oQueryDeleteTable)
drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

But printing the value straight from the raw dataframe shows that it contains characters like these:

>>> print(oParameterData.loc[oRow,'TableName'])
'h\x00e\x00r\x00o\x00_c\x00o\x00n\x00t\x00e\x00x\x00t\x00'
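Because a plain print hides null characters, it can help to look for the first non-printable character in the statement before sending it. The sketch below is only an illustration: `first_hidden_char` is a hypothetical helper, not part of pyodbc or pandas, and the `dirty` string is a shortened stand-in built from the fragment shown above. The index it finds happens to equal the 1:44 reported in the ParseException, which would fit if Hive counts columns from zero, but that is an assumption.

```python
def first_hidden_char(text):
    """Return the index of the first non-printable character, or None."""
    for index, char in enumerate(text):
        if not char.isprintable():
            return index
    return None


# The statement as it looked when printed (no hidden characters).
clean = "drop table if exists dl_audit_data_quality.hero_context_start_gamemode;"

# Shortened stand-in for what was really sent: null characters interleaved
# in the table name, as in the dataframe value shown above.
dirty = "drop table if exists dl_audit_data_quality." + "h\x00e\x00r\x00o\x00" + ";"

print(repr(dirty))               # repr() makes the \x00 characters visible
print(first_hidden_char(clean))  # None
print(first_hidden_char(dirty))  # 44
```

Calling repr() on the value, or running a check like this on oQueryDeleteTable inside the loop, makes this kind of problem visible without going through Hue.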

The problem was solved by reworking the encoding settings, as described in: Python Dictionary Contains Encoded Values

import pyodbc 
import pandas 

oConnexionString = 'Driver={ClouderaHive};[...]' 
oConnexion = pyodbc.connect(oConnexionString, autocommit=True) 
oConnexion.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8') 
oConnexion.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8') 
oConnexion.setencoding(encoding='utf-8') 
oQueryParameter = "select * from my_db.my_table;" 
oParameterData = pandas.read_sql(oQueryParameter, oConnexion) 
oCursor = oConnexion.cursor() 

for oRow in oParameterData.index: 
    sTableName = oParameterData.loc[oRow,'TableName'] 
    oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';' 
    print(oQueryDeleteTable) 
    oCursor.execute(oQueryDeleteTable)
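
As an extra safeguard, not part of the original answer, the table name can be cleaned and checked before it is concatenated into the statement. This is a minimal sketch under stated assumptions: `build_drop_statement` is a hypothetical helper, and it assumes table names consist only of letters, digits and underscores, which may be stricter than what your Hive setup allows.

```python
import re

# Assumption: valid table names contain only letters, digits and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_drop_statement(database, raw_table_name):
    """Strip stray null characters, validate the name, build the statement."""
    table_name = raw_table_name.replace("\x00", "").strip()
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError("unexpected table name: {!r}".format(raw_table_name))
    return "drop table if exists {}.{};".format(database, table_name)


print(build_drop_statement("my_db", "h\x00e\x00r\x00o\x00"))
# drop table if exists my_db.hero;
```

In the loop above this would replace the string concatenation, for example oQueryDeleteTable = build_drop_statement('my_db', sTableName). Setting the decoding on the connection remains the actual fix; this only keeps a malformed name from reaching oCursor.execute.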