
Livy in batch mode throws the error "Only local python files are supported: Parsed arguments" when I submit a Python file for execution. It does not work, and I have tried two ways:

  1. Running the .py file from the local file system
  2. Copying the .py file to HDFS and running it from there... but neither works.

Please help.

$ curl -X POST -H "Content-Type: application/json" tarun-ubuntu:8998/batches --data '{"file": "file:///home/tarun/spark/examples/src/main/python/pi.py", "name": "pipy", "executorCores":1, "executorMemory":"512m", "driverCores":1, "driverMemory":"512m", "queue":"default", "args":["10"]}' 

"requirement failed: Local path /home/tarun/spark/examples/src/main/python/pi.py cannot be added to user sessions." 

Moving pi.py onto HDFS made Livy at least accept the curl call:

$ curl -X POST -H "Content-Type: application/json" tarun-ubuntu:8998/batches --data '{"file": "/pi.py", "name": "pipy", "executorCores":1, "executorMemory":"512m", "driverCores":1, "driverMemory":"512m", "queue":"default", "args":["10"]}' 
{"id":20,"state":"running","appId":null,"appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":[]} 

But when I checked the log:

$ curl tarun-ubuntu:8998/batches/20/log | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1415  100  1415    0     0   186k      0 --:--:-- --:--:-- --:--:--  197k
{ 
    "from": 0, 
    "id": 20, 
    "log": [ 
     "Error: Only local python files are supported: Parsed arguments:", 
     " master     local", 
     " deployMode    client", 
     " executorMemory   512m", 
     " executorCores   1", 
     " totalExecutorCores  null", 
     " propertiesFile   /home/tarun/spark/conf/spark-defaults.conf", 
     " driverMemory   512m", 
     " driverCores    1", 
     " driverExtraClassPath null", 
     " driverExtraLibraryPath null", 
     " driverExtraJavaOptions null", 
     " supervise    false", 
     " queue     default", 
     " numExecutors   null", 
     " files     null", 
     " pyFiles     null", 
     " archives    null", 
     " mainClass    null", 
     " primaryResource   hdfs://localhost:54310/pi.py", 
     " name     pipy", 
     " childArgs    [10]", 
     " jars     null", 
     " packages    null", 
     " packagesExclusions  null", 
     " repositories   null", 
     " verbose     false", 
     "", 
     "Spark properties used, including those specified through", 
     " --conf and those from the properties file /home/tarun/spark/conf/spark-defaults.conf:", 
     " spark.driver.memory -> 512m", 
     " spark.executor.memory -> 512m", 
     " spark.driver.cores -> 1", 
     " spark.master -> local", 
     " spark.executor.cores -> 1", 
     "", 
     " .primaryResource", 
     "Run with --help for usage help or --verbose for debug output" 
    ], 
    "total": 38 
} 

$ curl tarun-ubuntu:8998/batches/20 | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   482  100   482    0     0   105k      0 --:--:-- --:--:-- --:--:--  117k
{ 
    "appId": null, 
    "appInfo": { 
     "driverLogUrl": null, 
     "sparkUiUrl": null 
    }, 
    "id": 20, 
    "log": [ 
     "Spark properties used, including those specified through", 
     " --conf and those from the properties file /home/tarun/spark/conf/spark-defaults.conf:", 
     " spark.driver.memory -> 512m", 
     " spark.executor.memory -> 512m", 
     " spark.driver.cores -> 1", 
     " spark.master -> local", 
     " spark.executor.cores -> 1", 
     "", 
     " .primaryResource", 
     "Run with --help for usage help or --verbose for debug output" 
    ], 
    "state": "dead" 
} 

Answer


The error Only local python files are supported is most likely thrown by Spark because Livy prepends an HDFS prefix to file paths by default.

Two things you should try:

  1. Add the directory your .py files are read from to the livy.file.local-dir-whitelist setting in livy.conf. According to the comments in the conf file, applications can otherwise "only reference remote URIs when starting sessions". This is most likely why Livy defaults to HDFS when you submit the job (see the sketch after this list).

  2. When passing the file parameter to the REST API, use only one slash after file:. For example, {"file": "file:/home/tarun/spark/examples/src/main/python/pi.py"}. I believe this is the correct syntax; a full request combining both fixes is shown below.
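
For the first suggestion, a minimal sketch of what the whitelist entry in livy.conf might look like, assuming pi.py stays in the directory from the question (the Livy server typically needs a restart for the change to take effect):

# livy.conf -- allow local files under this directory to be added to sessions
livy.file.local-dir-whitelist = /home/tarun/spark/examples/src/main/python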
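
For the second suggestion, the original request could then be retried like this (same parameters as in the question, only the file URI changed to the single-slash form):

$ curl -X POST -H "Content-Type: application/json" tarun-ubuntu:8998/batches --data '{"file": "file:/home/tarun/spark/examples/src/main/python/pi.py", "name": "pipy", "executorCores":1, "executorMemory":"512m", "driverCores":1, "driverMemory":"512m", "queue":"default", "args":["10"]}'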

One thing to be aware of when running in cluster mode:

Note that the URL should be reachable by the Spark driver process. If running the driver in cluster mode, it may reside on a different host, meaning "file:" URLs have to exist on that node (and not on the client machine).

In other words, you may need a copy of your .py file on every node in the cluster to ensure the driver process can read it; see the sketch below.
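
As a rough sketch (node1 and node2 are hypothetical placeholders for your worker hostnames), the file can be copied to the same path on each node:

# copy pi.py to the same path on every cluster node (hypothetical hostnames)
for host in node1 node2; do
  scp /home/tarun/spark/examples/src/main/python/pi.py "$host:/home/tarun/spark/examples/src/main/python/"
done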

Hope that helps.
