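Hadoop WordCount example - NullPointerException
I'm a Hadoop beginner. My setup: RHEL7, hadoop-2.7.3.
I'm trying to run Example: WordCount v2.0. I just copied the source code into a new Eclipse project and exported it to a wc.jar file.
I have configured Hadoop for Pseudo-Distributed Operation as described in the linked guide. Then I went through the following steps:
Create the input files in the input directory: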
echo "Hello World, Bye World!" > input/file01
echo "Hello Hadoop, Goodbye to hadoop." > input/file02
Start the environment, upload the input, and run the job:
sbin/start-dfs.sh
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/<username>
bin/hdfs dfs -put input input
bin/hadoop jar ws.jar WordCount2 input output
And this is what I got:
16/09/02 13:15:01 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/09/02 13:15:01 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/09/02 13:15:01 INFO input.FileInputFormat: Total input paths to process : 2
16/09/02 13:15:01 INFO mapreduce.JobSubmitter: number of splits:2
16/09/02 13:15:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local455553963_0001
16/09/02 13:15:01 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/09/02 13:15:01 INFO mapreduce.Job: Running job: job_local455553963_0001
16/09/02 13:15:01 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/09/02 13:15:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/09/02 13:15:01 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
16/09/02 13:15:02 INFO mapred.LocalJobRunner: Waiting for map tasks
16/09/02 13:15:02 INFO mapred.LocalJobRunner: Starting task: attempt_local455553963_0001_m_000000_0
16/09/02 13:15:02 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/09/02 13:15:02 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
16/09/02 13:15:02 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/aii/input/file02:0+33
16/09/02 13:15:02 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/09/02 13:15:02 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/09/02 13:15:02 INFO mapred.MapTask: soft limit at 83886080
16/09/02 13:15:02 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/09/02 13:15:02 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/09/02 13:15:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/09/02 13:15:02 INFO mapred.MapTask: Starting flush of map output
16/09/02 13:15:02 INFO mapred.LocalJobRunner: Starting task: attempt_local455553963_0001_m_000001_0
16/09/02 13:15:02 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/09/02 13:15:02 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
16/09/02 13:15:02 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/aii/input/file01:0+24
16/09/02 13:15:02 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/09/02 13:15:02 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/09/02 13:15:02 INFO mapred.MapTask: soft limit at 83886080
16/09/02 13:15:02 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/09/02 13:15:02 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/09/02 13:15:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/09/02 13:15:02 INFO mapred.MapTask: Starting flush of map output
16/09/02 13:15:02 INFO mapred.LocalJobRunner: map task executor complete.
16/09/02 13:15:02 WARN mapred.LocalJobRunner: job_local455553963_0001
java.lang.Exception: java.lang.NullPointerException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NullPointerException
at WordCount2$TokenizerMapper.setup(WordCount2.java:47)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/09/02 13:15:02 INFO mapreduce.Job: Job job_local455553963_0001 running in uber mode : false
16/09/02 13:15:02 INFO mapreduce.Job: map 0% reduce 0%
16/09/02 13:15:02 INFO mapreduce.Job: Job job_local455553963_0001 failed with state FAILED due to: NA
16/09/02 13:15:02 INFO mapreduce.Job: Counters: 0
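No results (output) are produced. Why am I getting this exception?
Thanks
Edit:
Thanks to the suggested solution, I realized that the WordCount example also includes a second run, which uses a patterns file: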
echo "\." > patterns.txt
echo "\," >> patterns.txt
echo "\!" >> patterns.txt
echo "to" >> patterns.txt
and then run:
bin/hadoop jar ws.jar WordCount2 -Dwordcount.case.sensitive=true input output -skip patterns.txt
And everything works great!
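For anyone hitting the same exception: the stack trace points at TokenizerMapper.setup (WordCount2.java:47). The source is not shown here, so the following is an assumption. As far as I understand the tutorial code, setup() loops over the job's cache files when wordcount.skip.patterns is enabled, and that array is null when no -skip file was passed. The sketch below shows a null guard for that situation in plain Java, without any Hadoop dependency. SkipPatternGuard and patternFileNames are hypothetical names made up for illustration; they are not part of Hadoop or of the tutorial.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop or WordCount v2.0.
class SkipPatternGuard {

    // Returns the file names of the given cache URIs. A null array (assumed to be
    // what the job reports when no -skip file was registered) is treated as
    // "no pattern files" instead of being iterated, which would throw an NPE.
    static List<String> patternFileNames(URI[] cacheFiles) {
        List<String> names = new ArrayList<>();
        if (cacheFiles == null) {
            return names; // nothing registered, so nothing to skip
        }
        for (URI uri : cacheFiles) {
            String path = uri.getPath();
            // Keep only the part after the last '/', or the whole path if there is none.
            names.add(path.substring(path.lastIndexOf('/') + 1));
        }
        return names;
    }

    public static void main(String[] args) {
        // First run: no -skip option, so the cache file array is assumed to be null.
        System.out.println(patternFileNames(null)); // prints []

        // Second run: -skip patterns.txt registers one cache file.
        URI[] withSkip = { URI.create("patterns.txt") };
        System.out.println(patternFileNames(withSkip)); // prints [patterns.txt]
    }
}
```

With a guard like this in the mapper's setup(), the first command (without -skip) would presumably run as well; passing -skip patterns.txt, as in the edit above, avoids the problem without touching the code.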
Thanks a lot! You can see the edit in my question; I figured it out from your answer :-) – ItayB