2016-08-22 42 views
0

I am trying to use Zeppelin with highcharts, but I could not connect Apache Zeppelin to highcharts. Error: value seriesCol is not a member of org.apache.spark.sql.DataFrame

%spark 
import com.knockdata.zeppelin.highcharts._ 
import com.knockdata.zeppelin.highcharts.model._ 
import sqlContext.implicits._ 

val Tokyo = Seq(7.0, 6.9, 9.5, 14.5, 18.2, 21.5, 25.2, 26.5, 23.3, 
18.3, 13.9, 9.6).map(("Tokyo", _)) 

val df = Tokyo.toDF("city", "temperature") 

df.show() 

highcharts(df.seriesCol("city").series("y" -> col("temperature"))).plot() 

This gives, in the Spark interpreter:
import com.knockdata.zeppelin.highcharts._ 
import com.knockdata.zeppelin.highcharts.model._ 
import sqlContext.implicits._ 
Tokyo: Seq[(String, Double)] = List((Tokyo,7.0), (Tokyo,6.9), (Tokyo,9.5), (Tokyo,14.5), (Tokyo,18.2), (Tokyo,21.5), (Tokyo,25.2), (Tokyo,26.5), (Tokyo,23.3), (Tokyo,18.3), (Tokyo,13.9), (Tokyo,9.6)) 
df: org.apache.spark.sql.DataFrame = [city: string, temperature: double] 
+-----+-----------+ 
| city|temperature| 
+-----+-----------+ 
|Tokyo|        7.0| 
|Tokyo|        6.9| 
|Tokyo|        9.5| 
|Tokyo|       14.5| 
|Tokyo|       18.2| 
|Tokyo|       21.5| 
|Tokyo|       25.2| 
|Tokyo|       26.5| 
|Tokyo|       23.3| 
|Tokyo|       18.3| 
|Tokyo|       13.9| 
|Tokyo|        9.6| 
+-----+-----------+ 
<console>:201: error: value seriesCol is not a member of org.apache.spark.sql.DataFrame 
       highcharts(df.seriesCol("city").series("y" -> col("temperature"))).plot() 

I have added the dependency artifact com.knockdata:zeppelin-highcharts:0.2 to the interpreter.

I then followed https://github.com/knockdata/zeppelin-highcharts/blob/master/docs/DemoLineChart.md and tried it with the bank data, but got

<console>:224: error: value series is not a member of org.apache.spark.rdd.RDD[Bank] 
possible cause: maybe a semicolon is missing before `value series'? 
       .series("x" -> "age", "y" -> avg(col("income"))) 

Please help me figure out where I went wrong. What could be the problem? Thanks in advance.

Answers

0

A DataFrame can be implicitly converted to a SeriesHolder, which provides the seriesCol function. It was added in version 0.6.0.

df.seriesCol("city") 

The error is most likely caused by using the wrong version of spark-highcharts. The sample code (doc) corresponds to version 0.6.0 (which maps directly to the Zeppelin version).

Using docker is probably the easiest way to get started, or build it in a similar way, as in the Dockerfile:

docker run -p 8080:8080 -d knockdata/zeppelin-highcharts 
+0

I am already using Zeppelin 0.6.0. The docs say: "If you want to run it in your existing Zeppelin, follow Use it in Zeppelin." Do I still need to use docker? And why does the bank data not work? –

0

I changed the Spark interpreter dependency artifact from com.knockdata:zeppelin-highcharts:0.2 to com.knockdata:zeppelin-highcharts:0.6.0, which solved this problem. But the bank data issue remains. Any help?

%spark 
import com.knockdata.zeppelin.highcharts._ 
import com.knockdata.zeppelin.highcharts.model._ 
import sqlContext.implicits._ 

val bankText = sc.textFile("/home/priyanka/Downloads/bank-data.csv") 

case class Bank(age:Integer, region:String, income : Float, married : String, children : Integer, car:String, save_act:String, current_act : String, mortgage : String, pep : String) 

// split each line, filter out header (starts with "age"), and map it into Bank case class 
val bank = bankText.map(s=>s.split(",")).filter(s=>s(0)!="age").map(
    s=>Bank(s(0).toInt, 
      s(1).replaceAll("\"", ""), 
      s(2).replaceAll("\"", "").toFloat, 
      s(3).replaceAll("\"", ""), 
      s(4).replaceAll("\"", "").toInt, 
      s(5).replaceAll("\"", ""), 
      s(6).replaceAll("\"", ""), 
      s(7).replaceAll("\"", ""),  
      s(8).replaceAll("\"", ""), 
      s(9).replaceAll("\"", "") 
     ) 
) 

// convert to DataFrame and register a temporary table 
bank.toDF().registerTempTable("bank") 

highcharts(bank.series("x" -> "age", "y" -> avg(col("income"))).orderBy(col("age"))).plot() 

This gives

import com.knockdata.zeppelin.highcharts._ 
import com.knockdata.zeppelin.highcharts.model._ 
import sqlContext.implicits._ 
bankText: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[49] at textFile at <console>:62 
defined class Bank 
bank: org.apache.spark.rdd.RDD[Bank] = MapPartitionsRDD[52] at map at <console>:66 
<console>:70: error: value series is not a member of org.apache.spark.rdd.RDD[Bank] 
possible cause: maybe a semicolon is missing before `value series'? 
       .series("x" -> "age", "y" -> avg(col("income"))) 
       ^

Thanks

+0

Thanks a lot for using it. I am the author. This is related to a version issue: use zeppelin-highcharts:0.6.0 on Zeppelin 0.6. (The previous doc said zeppelin-highcharts:0.6.0-SNAPSHOT; the interface has since changed, and I have corrected the doc.) –

+0

Thank you so much for providing this :) Any idea what is wrong with reading the bank data file? Some conflict between 'org.apache.spark.rdd.RDD' and 'org.apache.spark.sql.DataFrame'? –

+0

bank needs to be a DataFrame. Move 'toDF()' to the end of the bank definition. –
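Applying that suggestion to the bank paragraph above, a minimal sketch could look like the following. It assumes the same Zeppelin %spark session, the same Bank case class and bankText RDD as in the question, and spark-highcharts 0.6.0; it is not tested against other versions.

```scala
// build the DataFrame directly, so that .series (a spark-highcharts
// method on DataFrame) resolves instead of failing on RDD[Bank]
val bank = bankText.map(s => s.split(","))
  .filter(s => s(0) != "age")          // drop the CSV header row
  .map(s => Bank(s(0).toInt,
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", "").toFloat,
    s(3).replaceAll("\"", ""),
    s(4).replaceAll("\"", "").toInt,
    s(5).replaceAll("\"", ""),
    s(6).replaceAll("\"", ""),
    s(7).replaceAll("\"", ""),
    s(8).replaceAll("\"", ""),
    s(9).replaceAll("\"", "")))
  .toDF()                              // bank is now a DataFrame, not RDD[Bank]

bank.registerTempTable("bank")

// .series is available because bank is a DataFrame
highcharts(bank.series("x" -> "age", "y" -> avg(col("income")))
  .orderBy(col("age"))).plot()
```

The key change is that toDF() happens once, at the end of the definition, so every later use of bank (the temp table and the chart) sees a DataFrame.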
