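I am trying to read data from HDFS into R. One thing I struggle with when using sparklyr is deciphering the error messages, because I am not a Java programmer. Calling spark_read_csv in sparklyr fails with the error "invalid method csv for object".
Consider this example:
Do this in R to create the abalone data frame. Abalone is a dataset commonly used in machine learning examples.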
# load the PivotalR package (contains the abalone data) and create the data frame
if (!require(PivotalR)) {
  install.packages("PivotalR")  # the package name must be a quoted string
  library(PivotalR)
}
data(abalone)
#sample of data
head(abalone)
#export data to a CSV file
if (!require(readr)) {
  install.packages("readr")  # the package name must be a quoted string
  library(readr)
}
write_csv(abalone,'abalone.csv')
Do this at the command line:
hdfs dfs -put abalone.csv abalone.csv
#check to see if the file is on the hdfs
hdfs dfs -ls
Do this in R. This is set up to use your current version of Spark. You may have to change spark_home.
library(sparklyr)
library(SparkR)
sc <- spark_connect(
  master = 'yarn-client',
  spark_home = '/usr/hdp/current/spark-client',
  app_name = 'sparklyr',
  config = list(
    "sparklyr.shell.executor-memory" = "1G",
    "sparklyr.shell.driver-memory" = "4G",
    "spark.driver.maxResultSize" = "2G"  # may need to transfer a lot of data into R
  )
)
Read in the abalone file we just wrote to HDFS. You will have to change the path to match yours.
df <- spark_read_csv(sc, name = 'abalone', path = 'hdfs://pnhadoop/user/stc004/abalone.csv',
                     delimiter = ",", header = TRUE)
I get the following error:
Error: java.lang.IllegalArgumentException: invalid method csv for object 63
at sparklyr.Invoke$.invoke(invoke.scala:113)
at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
at sparklyr.StreamHandler$.read(stream.scala:55)
at sparklyr.BackendHandler.channelRead0(handler.scala:49)
at sparklyr.BackendHandler.channelRead0(handler.scala:14)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
I am not sure what is going on. I have used spark_read_csv before without errors, and I don't know how to decipher the Java error. Any thoughts?
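For anyone else trying to read a trace like this one, here is a minimal base-R sketch that pulls out the parts that usually matter: the exception class, the message, and the frames that are not netty or JVM plumbing. The helper name summarise_java_error is hypothetical (it is not part of sparklyr), and the input is only the first few lines of the error text from this question.

```r
# Error text copied from the question (first line plus a few frames).
err <- c(
  "Error: java.lang.IllegalArgumentException: invalid method csv for object 63",
  "at sparklyr.Invoke$.invoke(invoke.scala:113)",
  "at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)",
  "at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)"
)

# Hypothetical helper (not part of sparklyr): summarise a Java stack trace.
summarise_java_error <- function(lines) {
  lines <- trimws(lines)
  head_line <- lines[1]
  # Exception class: the dotted name between "Error: " and the next ": ".
  exception <- sub("^Error: ([A-Za-z0-9_.$]+): .*$", "\\1", head_line)
  # Human-readable message: everything after the exception class.
  message <- sub("^Error: [A-Za-z0-9_.$]+: ", "", head_line)
  # Stack frames start with "at "; drop the netty and java.lang plumbing.
  frames <- sub("^at ", "", lines[grepl("^at ", lines)])
  relevant <- frames[!grepl("^(io\\.netty|java\\.lang)\\.", frames)]
  list(exception = exception, message = message, frames = relevant)
}

info <- summarise_java_error(err)
print(info$exception)
print(info$message)
print(info$frames)
```

Read this way, the message says that a method named csv was invoked on a JVM object and could not be found there. As far as I know, Spark only gained a csv method on its DataFrameReader in version 2.0, so comparing what spark_version(sc) reports against the Spark installation under spark_home may be a useful next check. That is an assumption about the cause, not a confirmed diagnosis.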
The first thing I would check is the access permissions on the file. Are they what you expect? – mrjoseph