
I'm trying to do some analysis on tweets using the Twitter streaming API, but I can't get the tweet stream into Spark.

I first want to print the status messages from the stream and go from there.

My code looks like this:

public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("TwitterStreamPrinter").setMaster("local");

    Configuration twitterConf = new ConfigurationBuilder()
        .setOAuthConsumerKey(consumerKey)
        .setOAuthConsumerSecret(consumerSecret)
        .setOAuthAccessToken(accessToken)
        .setOAuthAccessTokenSecret(accessTokenSecret)
        .build();
    OAuth2Authorization auth = new OAuth2Authorization(twitterConf);
    JavaReceiverInputDStream<Status> twitterStream = TwitterUtils.createStream(ssc, auth);

    JavaDStream<String> statuses = twitterStream.map(new Function<Status, String>() {
        public String call(Status status) throws Exception {
            return status.getText();
        }
    });
    statuses.print();
}

It doesn't print anything other than the Spark logs. I initially thought it was an authorization problem, so I tried several different ways of passing the authorization, but maybe it isn't authorization at all.

I've looked at every example I could find on the web (there aren't many), and this code looks like the standard way to fetch Twitter statuses, so why doesn't it print anything? I also tried System.out.println, but that didn't work either.

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
16/03/19 12:02:23 INFO SparkContext: Running Spark version 1.6.1 
16/03/19 12:02:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/03/19 12:02:24 INFO SecurityManager: Changing view acls to: abcd 
16/03/19 12:02:24 INFO SecurityManager: Changing modify acls to: abcd 
16/03/19 12:02:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(abcd); users with modify permissions: Set(abcd) 
16/03/19 12:02:24 INFO Utils: Successfully started service 'sparkDriver' on port 50995. 
16/03/19 12:02:24 INFO Slf4jLogger: Slf4jLogger started 
16/03/19 12:02:25 INFO Remoting: Starting remoting 
16/03/19 12:02:25 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.0.12:51003] 
16/03/19 12:02:25 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 51003. 
16/03/19 12:02:25 INFO SparkEnv: Registering MapOutputTracker 
16/03/19 12:02:25 INFO SparkEnv: Registering BlockManagerMaster 
16/03/19 12:02:25 INFO DiskBlockManager: Created local directory at /private/var/folders/3b/wzflbsn146qgwdglbm_6ms3m0000hl/T/blockmgr-e3de07a6-0c62-47cf-9940-da18382c9241 
16/03/19 12:02:25 INFO MemoryStore: MemoryStore started with capacity 2.4 GB 
16/03/19 12:02:25 INFO SparkEnv: Registering OutputCommitCoordinator 
16/03/19 12:02:25 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
16/03/19 12:02:25 INFO SparkUI: Started SparkUI at http://10.0.0.12:4040 
16/03/19 12:02:25 INFO Executor: Starting executor ID driver on host localhost 
16/03/19 12:02:25 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51016. 
16/03/19 12:02:25 INFO NettyBlockTransferService: Server created on 51016 
16/03/19 12:02:25 INFO BlockManagerMaster: Trying to register BlockManager 
16/03/19 12:02:25 INFO BlockManagerMasterEndpoint: Registering block manager localhost:51016 with 2.4 GB RAM, BlockManagerId(driver, localhost, 51016) 
16/03/19 12:02:25 INFO BlockManagerMaster: Registered BlockManager 
16/03/19 12:02:25 WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data, otherwise Spark jobs will not get resources to process the received data. 
16/03/19 12:02:26 INFO SparkContext: Invoking stop() from shutdown hook 
16/03/19 12:02:26 INFO SparkUI: Stopped Spark web UI at http://10.0.0.12:4040 
16/03/19 12:02:26 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
16/03/19 12:02:26 INFO MemoryStore: MemoryStore cleared 
16/03/19 12:02:26 INFO BlockManager: BlockManager stopped 
16/03/19 12:02:26 INFO BlockManagerMaster: BlockManagerMaster stopped 
16/03/19 12:02:26 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
16/03/19 12:02:26 INFO SparkContext: Successfully stopped SparkContext 
16/03/19 12:02:26 INFO ShutdownHookManager: Shutdown hook called 
16/03/19 12:02:26 INFO ShutdownHookManager: Deleting directory /private/var/folders/3b/..... 

Answer


It's all there in your log:

16/03/19 12:02:25 WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data, otherwise Spark jobs will not get resources to process the received data.

So the fix is to set the master to local[*].

Besides that, did you forget to start the streaming context?

jssc.start();            // start the computation
jssc.awaitTermination(); // wait for the computation to terminate
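
Putting the two fixes together, a minimal sketch of a complete driver might look like the following. The class name, the 2-second batch interval, the placeholder credential strings, and the switch from OAuth2Authorization to twitter4j's user-context OAuthAuthorization are assumptions for illustration, not taken from the question:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.twitter.TwitterUtils;

import twitter4j.Status;
import twitter4j.auth.OAuthAuthorization;
import twitter4j.conf.Configuration;
import twitter4j.conf.ConfigurationBuilder;

public class TwitterStreamPrinter {
    public static void main(String[] args) throws Exception {
        // local[*]: the receiver occupies one thread, so more than one is needed
        // for the print() output to ever get scheduled
        SparkConf conf = new SparkConf()
                .setAppName("TwitterStreamPrinter")
                .setMaster("local[*]");
        // Batch interval is a placeholder; pick whatever suits the analysis
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(2));

        // Placeholder credentials -- substitute your own OAuth keys
        Configuration twitterConf = new ConfigurationBuilder()
                .setOAuthConsumerKey("consumerKey")
                .setOAuthConsumerSecret("consumerSecret")
                .setOAuthAccessToken("accessToken")
                .setOAuthAccessTokenSecret("accessTokenSecret")
                .build();
        // User-context OAuth rather than the application-only OAuth2Authorization
        OAuthAuthorization auth = new OAuthAuthorization(twitterConf);

        JavaReceiverInputDStream<Status> twitterStream = TwitterUtils.createStream(jssc, auth);

        // Extract the tweet text from each Status; print() shows the first few per batch
        JavaDStream<String> statuses = twitterStream.map(new Function<Status, String>() {
            public String call(Status status) {
                return status.getText();
            }
        });
        statuses.print();

        jssc.start();            // nothing runs until the context is started
        jssc.awaitTermination(); // keep the driver alive so batches keep arriving
    }
}

With local[*] the receiver gets a core of its own and the remaining cores run the batch jobs, which is exactly what the WARN message in your log is pointing at.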

Changing it to local[*] removes the WARN message, but it still doesn't print anything. – user2418202

So can you update the code and the log? How many cores do you have? –

Exactly the same log, just without the WARN message. As far as I know, the output shouldn't depend on the number of cores. – user2418202