2017-10-13

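How to get records as strings from socketTextStream

I want to get each record from a socket stream, and I want each record to be a string holding one line of input. How do I write this in Python? Thanks!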

model = pipeline.PipelineModel.read().load(model_path)

sc = spark.sparkContext
ssc = StreamingContext(sc, 1)

lines = ssc.socketTextStream(sys.argv[1], int(sys.argv[2]))

if (lines is not None):
    lines.foreachRDD(lambda rdd: rdd.foreach(processRecord))

def processRecord(record):
    print("test")
    ...

Answer

from __future__ import print_function 
import sys 
from pyspark import SparkContext 
from pyspark.streaming import StreamingContext 


if __name__ == "__main__": 
    sc = SparkContext(appName="Demo") 
    ssc = StreamingContext(sc, 1) 

    #record = ssc.socketTextStream("localhost", 9999) 
    record = ssc.socketTextStream(sys.argv[1], int(sys.argv[2])) 
    # split each line into words and print the first elements of every batch
    record.flatMap(lambda line: line.split(" ")).pprint() 

    # start streaming 
    ssc.start() 
    # block until the streaming context is stopped or fails
    ssc.awaitTermination() 
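
For the original question, here is a minimal sketch of handling each record as a string. It assumes Python 3 and a working Spark installation; `process_record` and `run` are hypothetical names chosen for illustration. As far as I know, `socketTextStream` decodes the incoming bytes as UTF-8, so each record should already be a `str` holding one line and no conversion should be needed.

```python
def process_record(record):
    """Handle one line read from the socket.

    socketTextStream decodes incoming bytes as UTF-8, so under Python 3
    each record is expected to arrive as a str already.
    """
    # Strip the surrounding whitespace and return the line as plain text.
    return record.strip()


def run(host, port, batch_seconds=1):
    """Print every line received on host:port (hypothetical helper)."""
    # Imported here so process_record can be used and tested without Spark.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="Demo")
    ssc = StreamingContext(sc, batch_seconds)

    lines = ssc.socketTextStream(host, port)
    # map() applies process_record to each record; pprint() shows the
    # first elements of every batch on the driver.
    lines.map(process_record).pprint()

    # Nothing is processed until the context is started.
    ssc.start()
    ssc.awaitTermination()
```

Calling `run(sys.argv[1], int(sys.argv[2]))` from a script, with something like `nc -lk 9999` feeding the port, should print each typed line. The sketch uses `map` plus `pprint` rather than `foreachRDD` with `rdd.foreach`, because `print` inside `rdd.foreach` runs on the executors and its output may not show up on the driver console.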

Thanks.

The record is not of string type. – icecream

I have added more code to the question. Please check what is wrong with my code. Thanks! – icecream
