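Multiple Flume Twitter agents

I'm learning Hadoop, Flume, etc., and one of the projects I started is sentiment analysis. That part works, but now I'm trying to expand it by collecting multiple sets of data. This is my flume.conf: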
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS HDFS2
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = xxx
TwitterAgent.sources.Twitter.consumerSecret = xxxx
TwitterAgent.sources.Twitter.accessToken = xxx
TwitterAgent.sources.Twitter.accessTokenSecret = xxxx
TwitterAgent.sources.Twitter.keywords = bbc
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://xxx:8020/user/flume/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
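What I'm hoping to achieve is to put all the tweets about BBC in the location above, and also to put tweets about Liverpool into a separate folder using the following config: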
TwitterAgent.sources.Twitter.keywords = liverpool
TwitterAgent.sinks.HDFS2.channel = MemChannel
TwitterAgent.sinks.HDFS2.type = hdfs
TwitterAgent.sinks.HDFS2.hdfs.path = hdfs://xxx:8020/user/flume/tweets/liverpool/
TwitterAgent.sinks.HDFS2.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS2.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS2.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS2.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS2.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel2.type = memory
TwitterAgent.channels.MemChannel2.capacity = 10000
TwitterAgent.channels.MemChannel2.transactionCapacity = 10
This isn't working, and I can't work out why. Can anyone point me in the right direction?
What errors are you seeing? You may have already seen this [blog post from Cloudera](http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop)
Generally I only see one agent running, with all the data being sent to a single folder.
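
A few things in the config in the question would explain that behaviour. Flume configs are Java properties files, so the second `TwitterAgent.sources.Twitter.keywords` line replaces the first rather than adding a second filter. Both sinks also read from `MemChannel`, and a channel hands each event to only one of its sinks, so the events get split between the two folders regardless of keyword. Finally, `MemChannel2` is configured but never listed in `TwitterAgent.channels`. Below is a minimal sketch of one way to separate the two streams, with one source, channel, and sink per keyword. The source names `TwitterBBC` and `TwitterLiverpool` are hypothetical names chosen for illustration, and I'm assuming the Twitter API accepts two simultaneous streams with the same credentials, which I have not verified.

```properties
# Sketch only: TwitterBBC and TwitterLiverpool are made-up source names.
TwitterAgent.sources = TwitterBBC TwitterLiverpool
TwitterAgent.channels = MemChannel MemChannel2
TwitterAgent.sinks = HDFS HDFS2

# First source: BBC tweets go into MemChannel.
TwitterAgent.sources.TwitterBBC.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.TwitterBBC.channels = MemChannel
TwitterAgent.sources.TwitterBBC.keywords = bbc
# consumerKey, consumerSecret, accessToken, accessTokenSecret as in the question

# Second source: Liverpool tweets go into MemChannel2.
# Assumption: a second stream is allowed for these credentials.
TwitterAgent.sources.TwitterLiverpool.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.TwitterLiverpool.channels = MemChannel2
TwitterAgent.sources.TwitterLiverpool.keywords = liverpool
# consumerKey, consumerSecret, accessToken, accessTokenSecret as in the question

# Each sink drains its own channel, so the streams never mix.
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS2.channel = MemChannel2
# Remaining sink and channel settings (type, hdfs.path, capacity, ...) as in the question
```

If two streams with one set of credentials turn out not to be allowed, an alternative would be a single source with both keywords plus some routing step to split the events, but that needs more than configuration changes to the setup shown here.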