PySpark HBase/Phoenix integration

I need to read Phoenix data into PySpark.
Edit: I am using the Spark HBase converters.
Here is the code snippet:
port="2181"
host="zookeperserver"
keyConv = "org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter"
valueConv = "org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter"
cmdata_conf = {
    "hbase.zookeeper.property.clientPort": port,
    "hbase.zookeeper.quorum": host,
    "hbase.mapreduce.inputtable": "camel",
    "hbase.mapreduce.scan.columns": "data:a",
}
sc.newAPIHadoopRDD(
    "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
    "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
    "org.apache.hadoop.hbase.client.Result",
    keyConverter=keyConv,
    valueConverter=valueConv,
    conf=cmdata_conf,
)
Traceback:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/hdp/2.3.0.0-2557/spark/python/pyspark/context.py", line 547, in newAPIHadoopRDD
jconf, batchSize)
File "/usr/hdp/2.3.0.0-2557/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/usr/hdp/2.3.0.0-2557/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.io.IOException: No table was provided.
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:130)
Any help would be much appreciated.
Thanks! /Tina
I tried the second approach, but I immediately get an error: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD. : java.io.IOException: No table was provided. Have you done this in PySpark? – Ranic
Have you passed a proper configuration to Spark's newAPIHadoopRDD, like this: sparkconf = {"hbase.zookeeper.quorum": zookeeperhost, "hbase.mapreduce.inputtable": sampletable, "hbase.mapreduce.scan.columns": columns} hbase_rdd = sc.newAPIHadoopRDD("org.apache.hadoop.hbase.mapreduce.TableInputFormat", "org.apache.hadoop.hbase.io.ImmutableBytesWritable", "org.apache.hadoop.hbase.client.Result", keyConverter=keyConv, valueConverter=valueConv, conf=sparkconf)
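Please try the approach above; I think you are not providing the table name in the configuration. Also, the values of keyConv and valueConv are examples.pythonconverters.ImmutableBytesWritableToStringConverter and examples.pythonconverters.HBaseResultToStringConverter, respectively.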
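The suggestion in the comments can be sketched as follows. This is only a minimal sketch: the helper names build_hbase_conf and read_hbase_table are hypothetical (they are not part of Spark or HBase), and the fully qualified converter class names are taken from the question rather than from the shortened form in the comment. It builds the configuration in one place and fails early if the table name is empty, instead of waiting for the Java-side "No table was provided" error.

```python
# Hypothetical helpers for illustration; not part of Spark or HBase.
# Converter class names are copied from the question above.
KEY_CONV = "org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter"
VALUE_CONV = "org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter"


def build_hbase_conf(quorum, table, columns=None, port="2181"):
    """Build the string-to-string conf dict passed to newAPIHadoopRDD."""
    if not table:
        # Catch a missing table name on the Python side.
        raise ValueError("hbase.mapreduce.inputtable must be a non-empty table name")
    conf = {
        "hbase.zookeeper.quorum": quorum,
        "hbase.zookeeper.property.clientPort": str(port),
        "hbase.mapreduce.inputtable": table,
    }
    if columns:
        # Space-separated "family:qualifier" entries, e.g. "data:a".
        conf["hbase.mapreduce.scan.columns"] = columns
    return conf


def read_hbase_table(sc, conf):
    """Read an HBase table as an RDD of (row key, value) strings.

    `sc` is assumed to be an existing SparkContext.
    """
    return sc.newAPIHadoopRDD(
        "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
        "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "org.apache.hadoop.hbase.client.Result",
        keyConverter=KEY_CONV,
        valueConverter=VALUE_CONV,
        conf=conf,
    )
```

With the values from the question, the call would be read_hbase_table(sc, build_hbase_conf("zookeperserver", "camel", "data:a")). The Python-side check only guards against an empty table name. If the configuration is complete and the error persists, the cause may lie elsewhere, for example in the connection to ZooKeeper or in the table name itself, so the driver log is worth checking.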