2015-12-21 39 views

Custom schema in Spark 1.4.1

I am trying to process a CSV file in the spark-shell on Spark 1.4.1, using the spark-csv package.

scala> import org.apache.spark.sql.hive.HiveContext                         
import org.apache.spark.sql.hive.HiveContext                           

scala> import org.apache.spark.sql.hive.orc._                           
import org.apache.spark.sql.hive.orc._                            

scala> import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType};               
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}                 

scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)                    
15/12/21 02:06:24 WARN SparkConf: The configuration key 'spark.yarn.applicationMaster.waitTries' has been deprecated as of Spark 1.3 and and may be removed in the future. Please use the new key 'spark.yarn.am.waitTime' instead.                  
15/12/21 02:06:24 INFO HiveContext: Initializing execution hive, version 0.13.1                  
hiveContext: org.apache.spark.sql.hive.HiveContext = [email protected]             

scala> val customSchema = StructType(Seq(StructField("year", IntegerType, true),StructField("make", StringType, true),StructField("model", StringType, true),StructField("comment", StringType, true),StructField("blank", StringType, true))) 
customSchema: org.apache.spark.sql.types.StructType = StructType(StructField(year,IntegerType,true), StructField(make,StringType,true), StructField(model,StringType,true), StructField(comment,StringType,true), StructField(blank,StringType,true))              

scala> val customSchema = (new StructType).add("year", IntegerType, true).add("make", StringType, true).add("model", StringType, true).add("comment", StringType, true).add("blank", StringType, true) 
:24: error: not enough arguments for constructor StructType: (fields: Array[org.apache.spark.sql.types.StructField])org.apache.spark.sql.types.StructType. Unspecified value parameter fields.                             

val customSchema = (new StructType).add("year", IntegerType, true).add("make", StringType, true).add("model", StringType,true).add("comment", StringType, true).add("blank", StringType, true) 

Answer


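According to the Spark 1.4.1 documentation, StructType has no no-argument constructor, which is why you are getting this error. You need to either upgrade to 1.5.x to get the no-argument constructor, or create the schema the way you do in your first example: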

val customSchema = StructType(Seq(StructField("year", IntegerType, true),StructField("make", StringType, true),StructField("model", StringType, true),StructField("comment", StringType, true),StructField("blank", StringType, true))) 
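If the one-line form above is hard to read, the same schema can be built from a list of (name, type) pairs. This is only a sketch for the spark-shell: it relies on the `StructType(Seq[StructField])` factory already used in the first example, and the `columns` helper value is a name introduced here for illustration, not something from the question.

```scala
import org.apache.spark.sql.types.{DataType, IntegerType, StringType, StructField, StructType}

// Column names and types taken from the question; `columns` is an illustrative helper.
val columns: Seq[(String, DataType)] = Seq(
  "year"    -> IntegerType,
  "make"    -> StringType,
  "model"   -> StringType,
  "comment" -> StringType,
  "blank"   -> StringType
)

// Build the schema through the companion-object factory, so neither the
// no-argument constructor nor the add(...) calls from the failing attempt are needed.
val customSchema = StructType(columns.map { case (name, dataType) =>
  StructField(name, dataType, nullable = true)
})
```

The resulting schema could then be passed to the reader, for example `hiveContext.read.format("com.databricks.spark.csv").option("header", "true").schema(customSchema).load("cars.csv")`, where the file name `cars.csv` and the `header` setting are assumptions, since the question does not show the load step.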

Thanks @Glennie Helles Sindholt. It is working now :) – Divya


Could you please mark this question as answered? :) –