2015-05-25 16 views
9

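spark + sbt-assembly: "deduplicate: different file contents found in the following"

I am running a Spark application and want to package the test classes into the fat jar. Strangely, "sbt assembly" succeeds, but "sbt test:assembly" fails. I tried the approach from "sbt-assembly : including test classes", but it does not work in my case.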

SBT version: 0.13.8

build.sbt:

import sbtassembly.AssemblyPlugin._ 

name := "assembly-test" 

version := "1.0" 

scalaVersion := "2.10.5" 

libraryDependencies ++= Seq(
    ("org.apache.spark" % "spark-core_2.10" % "1.3.1" % Provided)
        .exclude("org.mortbay.jetty", "servlet-api")
        .exclude("commons-beanutils", "commons-beanutils-core")
        .exclude("commons-collections", "commons-collections")
        .exclude("commons-logging", "commons-logging")
        .exclude("com.esotericsoftware.minlog", "minlog")
        .exclude("com.codahale.metrics", "metrics-core"),
    "org.json4s" % "json4s-jackson_2.10" % "3.2.10" % Provided,
    "com.google.inject" % "guice" % "4.0"
)

Project.inConfig(Test)(assemblySettings) 

Answers

13

You have to define a mergeStrategy in assembly, as I did for my Spark application below.

mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) => 
    { 
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last 
    case PathList("javax", "activation", xs @ _*) => MergeStrategy.last 
    case PathList("org", "apache", xs @ _*) => MergeStrategy.last 
    case PathList("com", "google", xs @ _*) => MergeStrategy.last 
    case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last 
    case PathList("com", "codahale", xs @ _*) => MergeStrategy.last 
    case PathList("com", "yammer", xs @ _*) => MergeStrategy.last 
    case "about.html" => MergeStrategy.rename 
    case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last 
    case "META-INF/mailcap" => MergeStrategy.last 
    case "META-INF/mimetypes.default" => MergeStrategy.last 
    case "plugin.properties" => MergeStrategy.last 
    case "log4j.properties" => MergeStrategy.last 
    case x => old(x) 
    } 
} 
+0

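I put all of this in the sbt file and added more "exclude(...)" clauses. The jar is now generated and the test classes are in it, but I found that "provided" does not work – Grant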

+0

"provided" is only needed when you submit your Spark application via spark-submit. If you run the Spark application directly, do not use "provided". –
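One way to reconcile the two comments above is to keep Spark marked as Provided (so it stays out of the jar used with spark-submit) and put the Provided dependencies back on the classpath of the run task only. The following is a sketch in sbt 0.13 syntax, modeled on the pattern described in the sbt-assembly README; whether it suits this particular build is an assumption. Note also that, as far as I know, sbt's Test classpath includes Provided dependencies, which may be why "provided" appears not to work for test:assembly.

```scala
// build.sbt (sbt 0.13.x syntax)
// Run the application from sbt with the full compile classpath,
// which includes Provided dependencies such as spark-core.
// The contents of the assembled jar are not affected by this setting.
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```

With this in place, `sbt run` should find the Spark classes locally, while `sbt assembly` still leaves them out of the fat jar.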

20

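As an addition to Wesley Milano's answer, the code needs to be adapted slightly for newer versions of the sbt-assembly plugin (i.e. 0.13.0), in case anyone is wondering about the deprecation warnings: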

assemblyMergeStrategy in assembly := { 
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last 
    case PathList("javax", "activation", xs @ _*) => MergeStrategy.last 
    case PathList("org", "apache", xs @ _*) => MergeStrategy.last 
    case PathList("com", "google", xs @ _*) => MergeStrategy.last 
    case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last 
    case PathList("com", "codahale", xs @ _*) => MergeStrategy.last 
    case PathList("com", "yammer", xs @ _*) => MergeStrategy.last 
    case "about.html" => MergeStrategy.rename 
    case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last 
    case "META-INF/mailcap" => MergeStrategy.last 
    case "META-INF/mimetypes.default" => MergeStrategy.last 
    case "plugin.properties" => MergeStrategy.last 
    case "log4j.properties" => MergeStrategy.last 
    case x => 
     val oldStrategy = (assemblyMergeStrategy in assembly).value 
     oldStrategy(x) 
} 
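For readers unsure what the block above does: it is a function from a file path inside the jar to a merge strategy. Cases are tried from top to bottom, the first match wins, and the final case hands anything unmatched to the plugin's previous (default) strategy. Below is a simplified, self-contained model of that lookup in plain Scala. It is not the plugin's real code: the `PathList` extractor here is a hypothetical stand-in, strategies are plain strings, and the default is reduced to a single case.

```scala
// Simplified model of a merge-strategy lookup; NOT sbt-assembly's implementation.
object MergeStrategySketch {

  // Hypothetical stand-in for sbtassembly.PathList:
  // splits a jar entry path into its segments so they can be pattern matched.
  object PathList {
    def unapplySeq(path: String): Option[Seq[String]] =
      Some(path.split("/").toSeq)
  }

  // Stand-in for the previous strategy. The real default has more cases;
  // here every path simply gets "deduplicate" (fail on differing contents).
  val defaultStrategy: String => String = _ => "deduplicate"

  // Cases are tried in order; the first matching case decides the strategy.
  val strategy: String => String = {
    case PathList("javax", "servlet", xs @ _*) => "last"
    case PathList("org", "apache", xs @ _*)    => "last"
    case "log4j.properties"                    => "last"
    case x                                     => defaultStrategy(x)
  }

  def main(args: Array[String]): Unit = {
    val paths = Seq(
      "javax/servlet/Servlet.class",
      "org/apache/commons/logging/Log.class",
      "log4j.properties",
      "com/example/Foo.class"
    )
    paths.foreach(p => println(p + " -> " + strategy(p)))
  }
}
```

In this model the first three paths resolve to "last" and `com/example/Foo.class` falls through to "deduplicate", which corresponds to the situation that produces the error in the question title when two jars ship different versions of the same file.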
+9

I have been using Scala for over a year and I have no idea what this code does, but the important thing is that it works. Thanks –

+0

Thanks, this solution works fine –

+0

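@FelipeAlmeida You seem experienced with Spark, so I was wondering if you could help me out... I am trying to create a jar file from my SBT project so that I can run it. Do you know how I can do that? – CapturedTree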
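On creating a jar from an SBT project: a minimal sketch, assuming the sbt-assembly plugin at the 0.13.0 version mentioned above. The file location `project/plugins.sbt` is sbt's usual convention; everything else about the project is assumed to look like the build.sbt in the question.

```scala
// project/plugins.sbt
// Registers the sbt-assembly plugin, which adds the `assembly` task.
// 0.13.0 is the plugin version mentioned in the answer above.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
```

Running `sbt assembly` should then write a fat jar under `target/scala-2.10/`, named `<name>-assembly-<version>.jar` by default, which can be passed to spark-submit.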