2013-07-18 33 views

Answers

6

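You can access Hive from Python using the hive library. To do so, import the ThriftHive class from hive.

The example below imports the Hive classes and runs a few queries: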

import sys 

from hive import ThriftHive 
from hive.ttypes import HiveServerException 

from thrift import Thrift 
from thrift.transport import TSocket 
from thrift.transport import TTransport 
from thrift.protocol import TBinaryProtocol 

try: 
    transport = TSocket.TSocket('localhost', 10000) 
    transport = TTransport.TBufferedTransport(transport) 
    protocol = TBinaryProtocol.TBinaryProtocol(transport) 
    client = ThriftHive.Client(protocol) 
    transport.open() 
    client.execute("CREATE TABLE r(a STRING, b INT, c DOUBLE)") 
    client.execute("LOAD DATA LOCAL INPATH '/path' INTO TABLE r")
    client.execute("SELECT * FROM r") 
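    # Fetch rows one at a time until the result set is exhausted.
    while True:
        row = client.fetchOne()
        if row is None:
            break
        print(row)

    # Run the query again and fetch every row in one call.
    client.execute("SELECT * FROM r")
    print(client.fetchAll())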
    transport.close() 
except Thrift.TException as tx:
    print('%s' % (tx.message))
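
In the example above, transport.close() is skipped whenever one of the execute calls raises. One possible way to guarantee the close is a small context manager. This is only a sketch: `opened` and `FakeTransport` are hypothetical names that are not part of the hive or thrift libraries, and the only assumption is that the transport exposes open() and close() as it does in the code above.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a Thrift-style transport and always close it afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        # Runs on success and on error alike.
        transport.close()


class FakeTransport(object):
    """Hypothetical stand-in for a real transport, used only for this demo."""

    def __init__(self):
        self.events = []

    def open(self):
        self.events.append("open")

    def close(self):
        self.events.append("close")


fake = FakeTransport()
try:
    with opened(fake):
        # Simulate a query that fails halfway through.
        raise RuntimeError("query failed")
except RuntimeError:
    pass

print(fake.events)
```

With a real transport, the queries would go inside the `with opened(transport):` block, and the `except Thrift.TException` handler would wrap that block.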
+0

Please tell me how to get the hive library?

+0

You can copy the contents of $HIVE_HOME/lib/py/* and paste them into your Python library directory. – Sreejith

+0

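@Sreejith I have no problem importing these Python libraries, but the code hangs after executing the Hive command. It turns out to be a common problem. Does your code connect to HiveServer1 or HiveServer2? https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/lCSuh6vLmHM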

9

You need to install these libraries:

pip install sasl 
pip install thrift 
pip install thrift-sasl 
pip install PyHive 

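If you are on Linux, you may need to install SASL separately before running the commands above. Install the package libsasl2-dev using apt-get, yum, or whatever package manager your distribution uses. For Windows, there are some options on GNU.org. On a Mac, SASL should be available if you have installed the Xcode developer tools (xcode-select --install).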
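
Before trying to connect, it can help to check that the four packages actually installed. The sketch below is one way to do that on Python 3. The import names `sasl`, `thrift`, `thrift_sasl` and `pyhive` are assumed to correspond to the pip packages listed above, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Import names assumed to match the pip packages listed above.
REQUIRED = ["sasl", "thrift", "thrift_sasl", "pyhive"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Not installed: " + ", ".join(missing))
else:
    print("All required modules are importable.")
```

If `sasl` shows up as missing after pip reports a build failure, the system SASL headers mentioned above are the first thing to check.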

After installation, you can run a Hive query like this:

from pyhive import hive 
conn = hive.Connection(host="YOUR_HIVE_HOST", port=PORT, username="YOU") 

Now that you have the Hive connection, you have options for how to use it. You can just run a query directly:

cursor = conn.cursor() 
cursor.execute("SELECT cool_stuff FROM hive_table") 
for result in cursor.fetchall(): 
    use_result(result) 
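
For large result sets, fetchall() pulls every row into memory at once. A possible alternative is to read in batches with the DB-API fetchmany() call. This is only a sketch: `iter_rows` is a hypothetical helper, the batch size is arbitrary, and the demo runs against an in-memory SQLite cursor as a stand-in, on the assumption that the PyHive cursor follows the same DB-API fetchmany() contract.

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time while fetching them in batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


# Demo only: an in-memory SQLite cursor stands in for a PyHive cursor.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE hive_table (cool_stuff INTEGER)")
demo_cursor.executemany(
    "INSERT INTO hive_table VALUES (?)", [(i,) for i in range(5)]
)
demo_cursor.execute("SELECT cool_stuff FROM hive_table ORDER BY cool_stuff")

rows = list(iter_rows(demo_cursor, batch_size=2))
print(rows)
demo_conn.close()
```

With a Hive connection, the same helper would be called as `iter_rows(cursor)` right after `cursor.execute(...)`.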

...or use the connection to build a Pandas DataFrame:

import pandas as pd 
df = pd.read_sql("SELECT cool_stuff FROM hive_table", conn) 
+1

"Could not start SASL: %s" % self.sasl.getError() - on Windows 2008 R2, Python 3.6. How do I fix this?