2015-09-04 80 views
4

I am trying to run the following simple commands in Apache Zeppelin. How do I print the contents of a Flink variable to the screen in Zeppelin?

%flink 

var rabbit = env.fromElements(
"ARTHUR: What, behind the rabbit?", 
"TIM: It is the rabbit!", 
"ARTHUR: You silly sod! You got us all worked up!", 
"TIM: Well, that's no ordinary rabbit. That's the most foul, cruel, and bad-tempered rodent you ever set eyes on.", 
"ROBIN: You tit! I soiled my armor I was so scared!", 
"TIM: Look, that rabbit's got a vicious streak a mile wide, it's a killer!") 

var counts = rabbit.flatMap { _.toLowerCase.split("\\W+")}.map{ (_,1)}.groupBy(0).sum(1) 

counts.print() 

I tried to print the results in the notebook, but unfortunately I only get the following output.

rabbit: org.apache.flink.api.scala.DataSet[String] = [email protected] 
counts: org.apache.flink.api.scala.AggregateDataSet[(String, Int)] = [email protected] 
res103: org.apache.flink.api.java.operators.DataSink[(String, Int)] = DataSink '<unnamed>' (Print to System.out) 

How can I get the contents of counts displayed in the Zeppelin notebook?

Answers

4

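The observed behavior comes from the interplay between Apache Zeppelin and Apache Flink. Zeppelin captures everything written to the standard output of Scala's Console. Flink, however, prints to System.out, which is exactly what happens when you call counts.print(). bzz's solution works because it prints the result through Console.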

I opened a JIRA issue [1] and a pull request [2] to correct this behavior, so that counts.print() will work as well.
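
To illustrate the difference between Console and System.out, here is a minimal sketch in plain Scala with no Flink or Zeppelin involved. It only mimics the capture mechanism and is not Zeppelin's actual interpreter code; the names ConsoleCaptureDemo and captureConsole are hypothetical. Output printed with println goes through Console and ends up in the buffer, while System.out.println bypasses the redirection.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}

object ConsoleCaptureDemo {
  // Hypothetical helper: runs `body` with scala.Console's output redirected
  // into a buffer and returns whatever was captured.
  def captureConsole(body: => Unit): String = {
    val buffer = new ByteArrayOutputStream()
    val stream = new PrintStream(buffer, true, "UTF-8")
    Console.withOut(stream) { body }
    stream.flush()
    buffer.toString("UTF-8")
  }

  def main(args: Array[String]): Unit = {
    // Predef.println writes to Console.out, so the redirection sees it.
    val viaConsole = captureConsole { println("(a,3)") }
    // System.out is a separate stream; Console.withOut does not touch it,
    // so this line goes to the real standard output instead of the buffer.
    val viaSystemOut = captureConsole { System.out.println("(all,1)") }

    println(s"captured via Console:    '${viaConsole.trim}'")
    println(s"captured via System.out: '${viaSystemOut.trim}'")
  }
}
```

This is why collecting the result and printing it with println shows up in the notebook, whereas a sink that writes to System.out does not.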

5

The way to print the result of such a computation in Zeppelin is:

%flink 
counts.collect().foreach(println(_)) 

//or one might prefer 
//counts.collect foreach println 

Output:

(a,3) 
(all,1) 
(and,1) 
(armor,1) 
...
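
As a sanity check on these numbers, the same word count can be sketched with plain Scala collections, without Flink. This is only an illustration under stated assumptions: LocalWordCount and wordCounts are hypothetical names, and the alphabetical sort is added here to make the output order deterministic (Flink itself does not promise an order for collect()).

```scala
object LocalWordCount {
  // Same input lines as in the question.
  val lines: Seq[String] = Seq(
    "ARTHUR: What, behind the rabbit?",
    "TIM: It is the rabbit!",
    "ARTHUR: You silly sod! You got us all worked up!",
    "TIM: Well, that's no ordinary rabbit. That's the most foul, cruel, and bad-tempered rodent you ever set eyes on.",
    "ROBIN: You tit! I soiled my armor I was so scared!",
    "TIM: Look, that rabbit's got a vicious streak a mile wide, it's a killer!")

  // Mirrors flatMap / map / groupBy(0) / sum(1) using standard collections.
  def wordCounts(text: Seq[String]): Seq[(String, Int)] =
    text
      .flatMap(_.toLowerCase.split("\\W+"))   // split each line into lowercase words
      .groupBy(identity)                      // group identical words together
      .map { case (word, occurrences) => (word, occurrences.size) }
      .toSeq
      .sortBy(_._1)                           // sorting is an addition for stable output

  def main(args: Array[String]): Unit =
    wordCounts(lines).foreach(println(_))
}
```

The first four tuples printed by this sketch are (a,3), (all,1), (and,1) and (armor,1), matching the excerpt above.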