I am using Maven with the Scala archetype, and I am getting this error:

"value $ is not a member of StringContext"

I have already tried adding several things to my pom.xml, but nothing has worked...

My code:

import org.apache.spark.ml.evaluation.RegressionEvaluator 
import org.apache.spark.ml.regression.LinearRegression 
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit} 
// To see less warnings 
import org.apache.log4j._ 
Logger.getLogger("org").setLevel(Level.ERROR) 


// Start a simple Spark Session 
import org.apache.spark.sql.SparkSession 
val spark = SparkSession.builder().getOrCreate() 

// Prepare training and test data. 
val data = spark.read.option("header","true").option("inferSchema","true").format("csv").load("USA_Housing.csv") 

// Check out the Data 
data.printSchema() 

// See an example of what the data looks like 
// by printing out a Row 
val colnames = data.columns 
val firstrow = data.head(1)(0) 
println("\n") 
println("Example Data Row") 
for(ind <- Range(1,colnames.length)){ 
    println(colnames(ind)) 
    println(firstrow(ind)) 
    println("\n") 
} 

//////////////////////////////////////////////////// 
//// Setting Up DataFrame for Machine Learning //// 
////////////////////////////////////////////////// 

// A few things we need to do before Spark can accept the data! 
// It needs to be in the form of two columns 
// ("label","features") 

// This will allow us to join multiple feature columns 
// into a single column of an array of feature values 
import org.apache.spark.ml.feature.VectorAssembler 
import org.apache.spark.ml.linalg.Vectors 

// Rename Price to label column for naming convention. 
// Grab only numerical columns from the data 
val df = data.select(data("Price").as("label"),$"Avg Area Income",$"Avg Area House Age",$"Avg Area Number of Rooms",$"Area Population") 

// An assembler converts the input values to a vector 
// A vector is what the ML algorithm reads to train a model 

// Set the input columns from which we are supposed to read the values 
// Set the name of the column where the vector will be stored 
val assembler = new VectorAssembler().setInputCols(Array("Avg Area Income","Avg Area House Age","Avg Area Number of Rooms","Area Population")).setOutputCol("features") 

// Use the assembler to transform our DataFrame to the two columns 
val output = assembler.transform(df).select($"label",$"features") 


// Create a Linear Regression Model object 
val lr = new LinearRegression() 

// Fit the model to the data 

// Note: Later we will see why we should split 
// the data first, but for now we will fit to all the data. 
val lrModel = lr.fit(output) 

// Print the coefficients and intercept for linear regression 
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}") 

// Summarize the model over the training set and print out some metrics! 
// Explore this in the spark-shell for more methods to call 
val trainingSummary = lrModel.summary 

println(s"numIterations: ${trainingSummary.totalIterations}") 
println(s"objectiveHistory: ${trainingSummary.objectiveHistory.toList}") 

trainingSummary.residuals.show() 

println(s"RMSE: ${trainingSummary.rootMeanSquaredError}") 
println(s"MSE: ${trainingSummary.meanSquaredError}") 
println(s"r2: ${trainingSummary.r2}") 

And my pom.xml is:

<project xmlns="http://maven.apache.org/POM/4.0.0" 
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> 
    <modelVersion>4.0.0</modelVersion> 
    <groupId>test</groupId> 
    <artifactId>outrotest</artifactId> 
    <version>1.0-SNAPSHOT</version> 
    <name>${project.artifactId}</name> 
    <description>My wonderfull scala app</description> 
    <inceptionYear>2015</inceptionYear> 
    <licenses> 
    <license> 
     <name>My License</name> 
     <url>http://....</url> 
     <distribution>repo</distribution> 
    </license> 
    </licenses> 

    <properties> 
    <maven.compiler.source>1.6</maven.compiler.source> 
    <maven.compiler.target>1.6</maven.compiler.target> 
    <encoding>UTF-8</encoding> 
    <scala.version>2.11.5</scala.version> 
    <scala.compat.version>2.11</scala.compat.version> 
    </properties> 

    <dependencies> 
    <dependency> 
     <groupId>org.scala-lang</groupId> 
     <artifactId>scala-library</artifactId> 
     <version>${scala.version}</version> 
    </dependency> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-mllib_2.11</artifactId> 
     <version>2.0.1</version> 
    </dependency> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-core_2.11</artifactId> 
     <version>2.0.1</version> 
    </dependency> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-sql_2.11</artifactId> 
     <version>2.0.2</version> 
    </dependency> 
    <dependency> 
     <groupId>com.databricks</groupId> 
     <artifactId>spark-csv_2.11</artifactId> 
     <version>1.5.0</version> 
    </dependency> 

    <!-- Test --> 
    <dependency> 
     <groupId>junit</groupId> 
     <artifactId>junit</artifactId> 
     <version>4.11</version> 
     <scope>test</scope> 
    </dependency> 
    <dependency> 
     <groupId>org.specs2</groupId> 
     <artifactId>specs2-junit_${scala.compat.version}</artifactId> 
     <version>2.4.16</version> 
     <scope>test</scope> 
    </dependency> 
    <dependency> 
     <groupId>org.specs2</groupId> 
     <artifactId>specs2-core_${scala.compat.version}</artifactId> 
     <version>2.4.16</version> 
     <scope>test</scope> 
    </dependency> 
    <dependency> 
     <groupId>org.scalatest</groupId> 
     <artifactId>scalatest_${scala.compat.version}</artifactId> 
     <version>2.2.4</version> 
     <scope>test</scope> 
    </dependency> 
    </dependencies> 

    <build> 
    <sourceDirectory>src/main/scala</sourceDirectory> 
    <testSourceDirectory>src/test/scala</testSourceDirectory> 
    <plugins> 
     <plugin> 
     <!-- see http://davidb.github.com/scala-maven-plugin --> 
     <groupId>net.alchim31.maven</groupId> 
     <artifactId>scala-maven-plugin</artifactId> 
     <version>3.2.0</version> 
     <executions> 
      <execution> 
      <goals> 
       <goal>compile</goal> 
       <goal>testCompile</goal> 
      </goals> 
      <configuration> 
       <args> 
       <!--<arg>-make:transitive</arg>--> 
       <arg>-dependencyfile</arg> 
       <arg>${project.build.directory}/.scala_dependencies</arg> 
       </args> 
      </configuration> 
      </execution> 
     </executions> 
     </plugin> 
     <plugin> 
     <groupId>org.apache.maven.plugins</groupId> 
     <artifactId>maven-surefire-plugin</artifactId> 
     <version>2.18.1</version> 
     <configuration> 
      <useFile>false</useFile> 
      <disableXmlReport>true</disableXmlReport> 
      <!-- If you have classpath issue like NoDefClassError,... --> 
      <!-- useManifestOnlyJar>false</useManifestOnlyJar --> 
      <includes> 
      <include>**/*Test.*</include> 
      <include>**/*Suite.*</include> 
      </includes> 
     </configuration> 
     </plugin> 
    </plugins> 
    </build> 
</project> 

I don't know how to fix it. Does anyone have any ideas?

Have you tried adding 'import sqlContext.implicits._'? –

Yes, but it doesn't work. It keeps giving the same error: "value $ is not a member of StringContext" – Thaise

You need to remove spark-csv from your pom.xml, because it will cause runtime errors – eliasah
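
For reference, Spark 2.x bundles a CSV data source in spark-sql itself, so the external com.databricks:spark-csv dependency is redundant there and the two implementations can clash at runtime. The question's read call works unchanged against the built-in source; the shorter .csv(...) form below is equivalent:

// Built-in CSV source in Spark 2.x; no com.databricks:spark-csv needed 
val data = spark.read 
  .option("header", "true") 
  .option("inferSchema", "true") 
  .csv("USA_Housing.csv") 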

Answers

Add this; it will work:

val spark = SparkSession.builder().getOrCreate()  
import spark.implicits._ // << add this 
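
For background: the $"name" syntax is not part of the Scala language itself; it comes from an implicit conversion on StringContext (StringToColumn) defined inside the implicits object that lives on the SparkSession instance, which is why the import can only appear after the session value exists. A minimal sketch of the working order:

import org.apache.spark.sql.SparkSession 

val spark = SparkSession.builder().getOrCreate() 
import spark.implicits._  // brings the $"..." (StringToColumn) syntax into scope 

// $"Price" now compiles: the implicit turns the interpolated string into a Column 
val renamed = spark.read.option("header", "true").csv("USA_Housing.csv") 
  .select($"Price".as("label")) 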

Hey @Thaise, please mark my answer as the recommended one –

You can use the col function instead; just import it like this:

import org.apache.spark.sql.functions.col 

and then change every $"column" to col("column").
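
Applied to the select from the question, that would look roughly like this (same column names, no implicits import required):

import org.apache.spark.sql.functions.col 

// col("...") constructs a Column directly, so nothing from spark.implicits._ is needed 
val df = data.select( 
  data("Price").as("label"), 
  col("Avg Area Income"), 
  col("Avg Area House Age"), 
  col("Avg Area Number of Rooms"), 
  col("Area Population")) 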

Hope it helps.

@Apurva's answer initially worked for me, in that the error disappeared in IntelliJ, but it then led to "Could not find implicit value for spark" at the sbt compile stage.

I found a workaround: import spark.implicits._ from the DataFrame's own SparkSession rather than from a reference obtained via getOrCreate:

import df.sparkSession.implicits._ 

where df is a DataFrame.

This may be because my code was placed inside a case class that receives an implicit val spark: SparkSession parameter; but I'm not sure why this fix worked for me.
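
A sketch of the setup being described (the case class and method names here are hypothetical, not from the original post):

import org.apache.spark.sql.{DataFrame, SparkSession} 

// Hypothetical wrapper class: the session arrives as an implicit parameter 
case class FeaturePrep(df: DataFrame)(implicit spark: SparkSession) { 

  // Import the $"..." syntax from the DataFrame's own session 
  // rather than from the implicit spark parameter 
  import df.sparkSession.implicits._ 

  def labeled: DataFrame = df.select($"label", $"features") 
} 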