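How can I determine whether the mapper was executed and, if it was not, for what reason? I write the paths read from the database to a text file on the local file system of the node that runs the mapper. Here is the code I use to determine whether the mapper ran: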
package org.myorg;
import java.io.*;
import java.util.*;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.logging.Level;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
public class ParallelIndexation {
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, LongWritable> {
        private final static LongWritable zero = new LongWritable(0);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, LongWritable> output, Reporter reporter)
                throws IOException {
            Configuration conf = new Configuration();
            // Read the number of computers from a local text file.
            int CountComputers;
            FileInputStream fstream = new FileInputStream(
                    "/export/hadoop-1.0.1/bin/countcomputers.txt");
            BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
            String result = br.readLine();
            CountComputers = Integer.parseInt(result);
            br.close(); // closing the reader also closes fstream
            Connection con = null;
            Statement st = null;
            ResultSet rs = null;
            String url = "jdbc:postgresql://192.168.1.8:5432/NexentaSearch";
            String user = "postgres";
            String password = "valter89";
            ArrayList<String> paths = new ArrayList<String>();
            try
            {
                // Load the task paths from the database.
                con = DriverManager.getConnection(url, user, password);
                st = con.createStatement();
                rs = st.executeQuery("select path from tasks order by id");
                while (rs.next()) { paths.add(rs.getString(1)); }
                // Write the paths to a local file as evidence that map() ran on this node.
                PrintWriter zzz = null;
                try
                {
                    zzz = new PrintWriter(new FileOutputStream("/export/hadoop-1.0.1/bin/readwaysfromdatabase.txt"));
                }
                catch (FileNotFoundException e)
                {
                    System.out.println("Error");
                    System.exit(0);
                }
                for (int i = 0; i < paths.size(); i++)
                {
                    zzz.println("paths[i]=" + paths.get(i) + "\n");
                }
                zzz.close();
            }
            catch (SQLException e)
            {
                System.out.println("Connection Failed! Check output console");
                e.printStackTrace();
            }
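However, the file /export/hadoop-1.0.1/bin/readwaysfromdatabase.txt was not created on any of the slave nodes. Does it follow that no mapper was executed at all? Here is the output of the job, which I redirected to a file: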
args[0]=/export/hadoop-1.0.1/bin/input
13/04/22 14:00:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/04/22 14:00:53 INFO mapred.FileInputFormat: Total input paths to process : 0
13/04/22 14:00:54 INFO mapred.JobClient: Running job: job_201304221331_0003
13/04/22 14:00:55 INFO mapred.JobClient: map 0% reduce 0%
13/04/22 14:01:12 INFO mapred.JobClient: map 0% reduce 100%
13/04/22 14:01:17 INFO mapred.JobClient: Job complete: job_201304221331_0003
13/04/22 14:01:17 INFO mapred.JobClient: Counters: 15
13/04/22 14:01:17 INFO mapred.JobClient: Job Counters
13/04/22 14:01:17 INFO mapred.JobClient: Launched reduce tasks=1
13/04/22 14:01:17 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=9079
13/04/22 14:01:17 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/04/22 14:01:17 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/04/22 14:01:17 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=7983
13/04/22 14:01:17 INFO mapred.JobClient: File Output Format Counters
13/04/22 14:01:17 INFO mapred.JobClient: Bytes Written=0
13/04/22 14:01:17 INFO mapred.JobClient: FileSystemCounters
13/04/22 14:01:17 INFO mapred.JobClient: FILE_BYTES_WRITTEN=21536
13/04/22 14:01:17 INFO mapred.JobClient: Map-Reduce Framework
13/04/22 14:01:17 INFO mapred.JobClient: Reduce input groups=0
13/04/22 14:01:17 INFO mapred.JobClient: Combine output records=0
13/04/22 14:01:17 INFO mapred.JobClient: Reduce shuffle bytes=0
13/04/22 14:01:17 INFO mapred.JobClient: Reduce output records=0
13/04/22 14:01:17 INFO mapred.JobClient: Spilled Records=0
13/04/22 14:01:17 INFO mapred.JobClient: Total committed heap usage (bytes)=16252928
13/04/22 14:01:17 INFO mapred.JobClient: Combine input records=0
13/04/22 14:01:17 INFO mapred.JobClient: Reduce input records=0
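The log above reports `Total input paths to process : 0` and contains no `Map input records` counter, which would be consistent with no map task having received any input. As a minimal sketch, assuming the job output has been saved as plain text, those two values can be extracted from the log lines. The class `JobLogCheck` and its method names are hypothetical and are not part of Hadoop; only the two log messages are taken from the output shown here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a saved JobClient log; not a Hadoop API.
final class JobLogCheck {
    private static final Pattern INPUT_PATHS =
            Pattern.compile("Total input paths to process\\s*:\\s*(\\d+)");
    private static final Pattern MAP_INPUT_RECORDS =
            Pattern.compile("Map input records=(\\d+)");

    // Returns the number captured by the first matching line, or -1 if no line matches.
    private static long firstMatch(Pattern pattern, List<String> lines) {
        for (String line : lines) {
            Matcher m = pattern.matcher(line);
            if (m.find()) {
                return Long.parseLong(m.group(1));
            }
        }
        return -1;
    }

    static long inputPaths(List<String> lines) {
        return firstMatch(INPUT_PATHS, lines);
    }

    static long mapInputRecords(List<String> lines) {
        return firstMatch(MAP_INPUT_RECORDS, lines);
    }

    // Heuristic only: treats the mapper as having run when at least one
    // input path was found and at least one record was read by map tasks.
    static boolean mapperLikelyRan(List<String> lines) {
        return inputPaths(lines) > 0 && mapInputRecords(lines) > 0;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "13/04/22 14:00:53 INFO mapred.FileInputFormat: Total input paths to process : 0",
                "13/04/22 14:01:17 INFO mapred.JobClient: Reduce input records=0");
        System.out.println("input paths: " + inputPaths(lines));
        System.out.println("mapper likely ran: " + mapperLikelyRan(lines));
    }
}
```

This is only a reading aid for the log. It does not explain why the input directory yielded no paths on the cluster.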
For comparison, here is the output, also saved to a file, of a successful run of the program on a single virtual machine:
12/10/28 10:41:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/10/28 10:41:14 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/10/28 10:41:14 INFO mapred.FileInputFormat: Total input paths to process : 1
12/10/28 10:41:15 INFO mapred.JobClient: Running job: job_local_0001
12/10/28 10:41:15 INFO mapred.Task: Using ResourceCalculatorPlugin : null
12/10/28 10:41:15 INFO mapred.MapTask: numReduceTasks: 1
12/10/28 10:41:15 INFO mapred.MapTask: io.sort.mb = 100
12/10/28 10:41:15 INFO mapred.MapTask: data buffer = 79691776/99614720
12/10/28 10:41:15 INFO mapred.MapTask: record buffer = 262144/327680
12/10/28 10:41:15 INFO mapred.MapTask: Starting flush of map output
12/10/28 10:41:16 INFO mapred.JobClient: map 0% reduce 0%
12/10/28 10:41:17 INFO mapred.MapTask: Finished spill 0
12/10/28 10:41:17 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/10/28 10:41:18 INFO mapred.LocalJobRunner: file:/export/hadoop-1.0.1/bin/input/paths.txt:0+156
12/10/28 10:41:18 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/10/28 10:41:18 INFO mapred.Task: Using ResourceCalculatorPlugin : null
12/10/28 10:41:18 INFO mapred.LocalJobRunner:
12/10/28 10:41:18 INFO mapred.Merger: Merging 1 sorted segments
12/10/28 10:41:18 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 199 bytes
12/10/28 10:41:18 INFO mapred.LocalJobRunner:
12/10/28 10:41:19 INFO mapred.JobClient: map 100% reduce 0%
12/10/28 10:41:19 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/10/28 10:41:19 INFO mapred.LocalJobRunner:
12/10/28 10:41:19 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/10/28 10:41:19 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to file:/export/hadoop-1.0.1/bin/output
12/10/28 10:41:21 INFO mapred.LocalJobRunner: reduce > reduce
12/10/28 10:41:21 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
12/10/28 10:41:22 INFO mapred.JobClient: map 100% reduce 100%
12/10/28 10:41:22 INFO mapred.JobClient: Job complete: job_local_0001
12/10/28 10:41:22 INFO mapred.JobClient: Counters: 18
12/10/28 10:41:22 INFO mapred.JobClient: File Input Format Counters
12/10/28 10:41:22 INFO mapred.JobClient: Bytes Read=156
12/10/28 10:41:22 INFO mapred.JobClient: File Output Format Counters
12/10/28 10:41:22 INFO mapred.JobClient: Bytes Written=177
12/10/28 10:41:22 INFO mapred.JobClient: FileSystemCounters
12/10/28 10:41:22 INFO mapred.JobClient: FILE_BYTES_READ=9573
12/10/28 10:41:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=73931
12/10/28 10:41:22 INFO mapred.JobClient: Map-Reduce Framework
12/10/28 10:41:22 INFO mapred.JobClient: Reduce input groups=4
12/10/28 10:41:22 INFO mapred.JobClient: Map output materialized bytes=203
12/10/28 10:41:22 INFO mapred.JobClient: Combine output records=4
12/10/28 10:41:22 INFO mapred.JobClient: Map input records=1
12/10/28 10:41:22 INFO mapred.JobClient: Reduce shuffle bytes=0
12/10/28 10:41:22 INFO mapred.JobClient: Reduce output records=4
12/10/28 10:41:22 INFO mapred.JobClient: Spilled Records=8
12/10/28 10:41:22 INFO mapred.JobClient: Map output bytes=189
12/10/28 10:41:22 INFO mapred.JobClient: Total committed heap usage (bytes)=321527808
12/10/28 10:41:22 INFO mapred.JobClient: Map input bytes=156
12/10/28 10:41:22 INFO mapred.JobClient: Combine input records=0
12/10/28 10:41:22 INFO mapred.JobClient: Map output records=4
12/10/28 10:41:22 INFO mapred.JobClient: SPLIT_RAW_BYTES=98
12/10/28 10:41:22 INFO mapred.JobClient: Reduce input records=0
@ChrisWhite I ran the program and redirected its output to a file with the following command:
./hadoop jar /export/hadoop-1.0.1/bin/ParallelIndexation.jar org.myorg.ParallelIndexation /export/hadoop-1.0.1/bin/input /export/hadoop-1.0.1/bin/output -D mapred.map.tasks=1 1> resultofexecute.txt 2&>1
My cluster has 4 nodes: one master, one for the secondarynamenode, and 2 slaves.
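Please don't use DataInputStream to read text files. You don't need it, so please remove it. – 2013-04-24 07:41:39
@PeterLawrey Why? – user2306966 2013-04-24 08:24:10
The DataInputStream is redundant, but this bad example gets copied about 30 times a month on Stack Overflow. It pains me because it is wrong, has always been wrong, sometimes causes bugs, and is entirely avoidable. – 2013-04-24 08:45:37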
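To illustrate the point made in the comments, here is a minimal sketch of reading a single integer from a text file with only a `BufferedReader` over an `InputStreamReader`, with no `DataInputStream` in the chain. It assumes Java 8 or later (for try-with-resources and `UncheckedIOException`) and a UTF-8 encoded file. The class name `CountReader` and its methods are hypothetical and do not appear in the original code.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are illustrative, not from the original post.
final class CountReader {

    // Reads the first line of a character stream and parses it as an integer.
    static int readCount(Reader source) {
        // try-with-resources closes the BufferedReader and the reader it wraps.
        try (BufferedReader br = new BufferedReader(source)) {
            String line = br.readLine();
            if (line == null) {
                throw new IllegalArgumentException("input is empty");
            }
            return Integer.parseInt(line.trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens a text file with FileInputStream + InputStreamReader only.
    static int readCountFromFile(String path) {
        try {
            return readCount(new InputStreamReader(
                    new FileInputStream(path), StandardCharsets.UTF_8));
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for a file whose first line holds the count.
        System.out.println(readCount(new StringReader("4\n"))); // prints 4
    }
}
```

In the mapper from the question, `readCountFromFile("/export/hadoop-1.0.1/bin/countcomputers.txt")` would take the place of the manual open, read, and close sequence.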