2016-12-14

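I have a Scala object that queries MySQL tables, joins them, and writes the data to S3. I tested the code locally and it runs fine. But when I submit it to the cluster, it throws the following error, even though the MySQL connector is included (the spark-submit command is below):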

Exception in thread "main" java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:53)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:123)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:117)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:53)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
    at QuaterlyAudit$.main(QuaterlyAudit.scala:51)
    at QuaterlyAudit.main(QuaterlyAudit.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

This is my spark-submit command:

nohup spark-submit --class QuaterlyAudit --master yarn-client --num-executors 8 \
    --driver-memory 16g --executor-memory 20g --executor-cores 10 /mypath/campaign.jar &

I am using sbt, and I include the MySQL connector in the sbt assembly. Below is my build.sbt file:

name := "mobilewalla" 

version := "1.0" 

scalaVersion := "2.11.8" 

libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % "2.0.0" % "provided", 
    "org.apache.spark" %% "spark-sql" % "2.0.0" % "provided", 
    "org.apache.hadoop" % "hadoop-aws" % "2.6.0" intransitive(), 
    "mysql" % "mysql-connector-java" % "5.1.37") 

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) =>
    xs.map(_.toLowerCase) match {
      case ("manifest.mf" :: Nil) |
           ("index.list" :: Nil) |
           ("dependencies" :: Nil) |
           ("license" :: Nil) |
           ("notice" :: Nil) => MergeStrategy.discard
      case _ => MergeStrategy.first // was 'discard' previously
    }
  case "reference.conf" => MergeStrategy.concat
  case _ => MergeStrategy.first
}
assemblyJarName in assembly := "campaign.jar" 

I also tried with:

nohup spark-submit --driver-class-path /mypath/mysql-connector-java-5.1.37.jar \
    --class QuaterlyAudit --master yarn-client --num-executors 8 --driver-memory 16g \
    --executor-memory 20g --executor-cores 10 /mypath/campaign.jar &

But still no luck. What am I missing here?

Answers

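Spark evidently cannot find the JDBC jar. Many people run into this problem, and there are a few workarounds. It happens because the jar is not made available to the driver and the executors.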

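  1. You can build an assembly (fat) jar of your application with your build tool (Maven, SBT), so that you do not need to add the dependencies on the spark-submit command line.
  2. You can pass the jars on the spark-submit command line with the following option: --jars $(echo ./lib/*.jar | tr ' ' ',')
  3. You can also set these two properties in the SPARK_HOME/conf/spark-defaults.conf file: spark.driver.extraClassPath and spark.executor.extraClassPath, using the path of the jar file as their value. Make sure the same path exists on the worker nodes.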
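
Before choosing among these options, it may help to confirm whether the JVM can see the driver at all. The stack trace in the question shows Spark ending up in DriverManager.getDriver, which throws "No suitable driver" when no registered driver accepts the JDBC URL. The following is only a diagnostic sketch using plain JDK calls; the object name DriverCheck is hypothetical, and com.mysql.jdbc.Driver is assumed to be the driver class for the Connector/J 5.1.x jar used in the question.

```scala
import java.sql.DriverManager
import scala.util.Try

// Hypothetical helper for debugging; not part of Spark or of the original post.
object DriverCheck {

  // True if the class can be loaded by the current class loader.
  // Class.forName also initializes the class, which for typical JDBC
  // drivers registers the driver with DriverManager as a side effect.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  // True if some registered JDBC driver accepts the given URL.
  // DriverManager.getDriver throws SQLException("No suitable driver") otherwise.
  def hasDriverFor(url: String): Boolean =
    Try(DriverManager.getDriver(url)).isSuccess
}
```

Logging DriverCheck.isOnClasspath("com.mysql.jdbc.Driver") and DriverCheck.hasDriverFor(url) at the top of main, with the JDBC URL from the job, would show whether the class is missing from the classpath or present but not registered.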

It was fixed by adding '.option("driver", "com.mysql.jdbc.Driver")' to the 'sqlContext.read' call. I used the command 'nohup spark-submit --class QuaterlyAudit --master yarn-client --num-executors 8 --driver-memory 16g --executor-memory 20g --executor-cores 10 /mypath/campaign.jar &' and it worked. – toofrellik

You had already added the jar at build time using only the build manager. That is probably the approach I mentioned in the first point. –
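
The fix reported in the comments, naming the driver class explicitly on the JDBC read, can be sketched as follows. This is an illustration under assumptions rather than the asker's actual code: the object JdbcReadOptions and its parameters are hypothetical, and the URL, table, and credentials in the usage comment are placeholders. The option keys url, dbtable, user, password, and driver are the ones read by Spark's JDBC data source.

```scala
// Hypothetical helper that builds the options for a Spark JDBC read.
// Only the "driver" entry is the change described in the comments;
// everything else is what a JDBC read needs anyway.
object JdbcReadOptions {

  // Driver class for MySQL Connector/J 5.1.x (assumed from the build file).
  val MySqlDriver = "com.mysql.jdbc.Driver"

  def mysql(url: String, table: String, user: String, password: String): Map[String, String] =
    Map(
      "url"      -> url,
      "dbtable"  -> table,
      "user"     -> user,
      "password" -> password,
      "driver"   -> MySqlDriver // tells Spark which class to load instead of asking DriverManager
    )
}

// Usage inside the Spark job (requires spark-sql on the classpath;
// the connection values below are placeholders):
//   val opts = JdbcReadOptions.mysql("jdbc:mysql://dbhost:3306/mydb", "mytable", "user", "secret")
//   val df   = sqlContext.read.format("jdbc").options(opts).load()
```

With the driver option set, the connector still has to be on the classpath, which the assembly jar from the question is meant to provide.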