How do I load a CSV file from HDFS in the SystemML DSL (DML)?
I tried something like this:
X = read("hdfs://ip-XXX-XXX-XXX-XXX:9000/SystemML/data/NN_X_100_10.csv");
I have checked that the file really exists at this location on HDFS.
When I run the DML script with:
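For reference, an existence check along these lines is one way to confirm that the path is visible on HDFS. This is only a sketch: it assumes the Hadoop `hdfs` CLI is installed and on the PATH, and `hdfs_file_exists` is a hypothetical helper name, not part of SystemML or Hadoop.

```shell
# Hypothetical helper: succeeds (exit 0) if the given path exists on HDFS.
# Assumption: the Hadoop `hdfs` command-line client is on the PATH.
hdfs_file_exists() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs CLI not found on PATH" >&2
        return 2
    fi
    # `hdfs dfs -test -e` exits with 0 when the path exists.
    hdfs dfs -test -e "$1"
}

# Same URI as in the read() call above (host left masked as in the post).
if hdfs_file_exists "hdfs://ip-XXX-XXX-XXX-XXX:9000/SystemML/data/NN_X_100_10.csv"; then
    echo "found"
else
    echo "not found (or hdfs CLI unavailable)"
fi
```

A successful check here only shows that the HDFS client can see the file; it says nothing about which file system SystemML resolves the path against.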
$SPARK_HOME/bin/spark-submit ~/Nearest_Neighbour_Search/SystemML/systemml-0.14.0-incubating.jar -f ~/Nearest_Neighbour_Search/SystemML/Task03_NN_SystemML_1000_hdfs.dml
it fails with the following error:
ERROR:/home/ubuntu/Nearest_Neighbour_Search/SystemML/Task03_NN_SystemML_1000_hdfs.dml -- line 1, column 0 -- Read input file does not exist on FS (local mode): hdfs://ip-172-30-4-168:9000/SystemML/data/NN_X_1000000_1000.csv
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:367)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:214)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.sysml.parser.LanguageException: Invalid Parameters : ERROR: /home/ubuntu/Nearest_Neighbour_Search/SystemML/Task03_NN_SystemML_1000_hdfs.dml -- line 1, column 0 -- Read input file does not exist on FS (local mode): hdfs://ip-172-30-4-168:9000/SystemML/data/NN_X_1000000_1000.csv
at org.apache.sysml.parser.Expression.raiseValidateError(Expression.java:549)
at org.apache.sysml.parser.DataExpression.validateExpression(DataExpression.java:641)
at org.apache.sysml.parser.StatementBlock.validate(StatementBlock.java:592)
at org.apache.sysml.parser.DMLTranslator.validateParseTree(DMLTranslator.java:143)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:591)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:353)
... 10 more
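I think the problem is related to local mode, but I don't know how to configure SystemML so that it reads from HDFS.
Any suggestions would be highly appreciated!
Thanks!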
Thanks for your answer! I tried the new build and it works!