
Trying to create an SBT fat jar gives the error below. Are source classes that were recompiled into the Spark jars breaking sbt's merge?

java.lang.RuntimeException: deduplicate: different file contents found in the following: 
C:\Users\db\.ivy2\cache\org.apache.spark\spark-network-common_2.10\jars\spark-network-common_2.10-1.6.3.jar:com/google/common/base/Function.class 
C:\Users\db\.ivy2\cache\com.google.guava\guava\bundles\guava-14.0.1.jar:com/google/common/base/Function.class 

There are many such classes; this is just one example. Guava 14.0.1 is the version of Function.class in play in both jars:

[info] +-com.google.guava:guava:14.0.1 
... 
[info] | | +-com.google.guava:guava:14.0.1 

That implies SBT/Ivy isn't choosing between two different versions, yet the size and date of the class differ between the jars, which presumably causes the error above:

$ jar tvf /c/Users/db/.ivy2/cache/org.apache.spark/spark-network-common_2.10/jars/spark-network-common_2.10-1.6.3.jar | grep "com/google/common/base/Function.class" 
    549 Wed Nov 02 16:03:20 CDT 2016 com/google/common/base/Function.class 

$ jar tvf /c/Users/db/.ivy2/cache/com.google.guava/guava/bundles/guava-14.0.1.jar | grep "com/google/common/base/Function.class" 
    543 Thu Mar 14 19:56:52 CDT 2013 com/google/common/base/Function.class 

It appears that Apache is recompiling Function.class from source rather than including the class as originally compiled. Is that a correct understanding of what is happening here? The recompiled classes can be excluded using sbt, but is there a way to build the jar without explicitly excluding, by name, every jar that contains recompiled source? Excluding jars explicitly leads to something along the lines of the snippet below, which makes it look like I'm heading down the wrong path here:

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.3" 
    excludeAll(
    ExclusionRule(organization = "com.twitter"), 
    ExclusionRule(organization = "org.apache.spark", name = "spark-network-common_2.10"), 
    ExclusionRule(organization = "org.apache.hadoop", name = "hadoop-client"), 
    ExclusionRule(organization = "org.apache.hadoop", name = "hadoop-hdfs"), 
    ExclusionRule(organization = "org.tachyonproject", name = "tachyon-client"), 
    ExclusionRule(organization = "commons-beanutils", name = "commons-beanutils"), 
    ExclusionRule(organization = "commons-collections", name = "commons-collections"), 
    ExclusionRule(organization = "org.apache.hadoop", name = "hadoop-yarn-api"), 
    ExclusionRule(organization = "org.apache.hadoop", name = "hadoop-yarn-common"), 
    ExclusionRule(organization = "org.apache.curator", name = "curator-recipes") 
) 
, 
libraryDependencies += "org.apache.spark" %% "spark-network-common" % "1.6.3" exclude("com.google.guava", "guava"), 
libraryDependencies += "org.apache.spark" %% "spark-graphx" % "1.6.3", 
libraryDependencies += "com.typesafe.scala-logging" %% "scala-logging-slf4j" % "2.1.2", 
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0" exclude("com.google.guava", "guava"), 
libraryDependencies += "com.google.guava" % "guava" % "14.0.1", 
libraryDependencies += "org.json4s" %% "json4s-native" % "3.2.11", 
libraryDependencies += "org.json4s" %% "json4s-ext" % "3.2.11", 
libraryDependencies += "com.rabbitmq" % "amqp-client" % "4.1.1", 
libraryDependencies += "commons-codec" % "commons-codec" % "1.10", 

If this is the wrong path, what is the cleaner way?

Answer


If this is the wrong path, what is the cleaner way?

The cleaner way is not to package spark-core at all: once you install Spark on your target machine, it is provided to your application at runtime (you can usually find the jars under /usr/lib/spark/jars).

You should mark these Spark dependencies as % "provided". That should help you avoid many of the conflicts caused by packaging those jars.
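A minimal sketch of that suggestion, reusing the coordinates from the question (only the Spark artifacts change; everything else stays packaged):

// Spark is already on the target machine, so keep it out of the fat jar
libraryDependencies += "org.apache.spark" %% "spark-core"   % "1.6.3" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-graphx" % "1.6.3" % "provided"
// application-only dependencies are still assembled as usual
libraryDependencies += "com.google.guava" % "guava" % "14.0.1"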


That's what I do for the application when it runs under Spark in the cloud. But some of the same code is also used as part of a GUI, where no container provides anything. Maybe that points to a better way to build the GUI's jar? –


@DonBranson I see. I definitely wouldn't recommend packaging Spark as part of any uber jar; it will lead you into dependency hell because Spark is so large. Instead, I would make sure the container gets these dependencies from some configuration management that takes care of unpacking the Spark jars there. –


It's certainly a dependency mess. The GUI has no container. I need to build a standalone that I can share with non-developers. Is there a way to do that? Maybe I need to package all the jars along with a script that adds them all to the classpath. –
