我正在編寫自己的Pig Store類,我不想將其存儲在文件中,我打算將它發送給某些第三方數據存儲(缺少API調用)。未設置Hadoop Pig輸出目錄
注意:我在Cloudera的VirtualBox映像上運行它。
我寫我的java類(見下表),並創建mystore.jar我使用這下面id.pig腳本:
store B INTO 'mylocation' USING MyStore('mynewlocation')
運行此腳本以豬,同時,我看到下面的錯誤: 錯誤6000: 輸出位置驗證失敗:'file://home/cloudera/test/id.out更多信息如下: 未設置輸出目錄。
or.apache.pig.impl.plan.VisitorException: ERROR 6000:
at or.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileValidator.visit(InputOutputFileValidator.java:95)
請幫忙!
-------------------- MyStore.java ----------------------
public class MyStore extends StoreFunc {
protected RecordWriter writer = null;
private String location = null;
public MyStore() {
location= null;
}
public MyStore (String location) {
this.location= location;
}
@Override
public OutputFormat getOutputFormat() throws IOException {
return new MyStoreOutputFormat(location);
}
@Override
public void prepareToWrite(RecordWriter writer) throws IOException {
this.writer = writer;
}
@Override
public void putNext(Tuple tuple) throws IOException {
//write tuple to location
try {
writer.write(null, tuple.toString());
} catch (InterruptedException e) {
e.printStackTrace();
}
}
@Override
public void setStoreLocation(String location, Job job) throws IOException {
if(location!= null)
this.location= location;
}
}
-------------------- MyStoreOutputFormat.java --------------------- -
import java.io.DataOutputStream;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.pig.data.Tuple;
public class MyStoreOutputFormat extends
TextOutputFormat<WritableComparable, Tuple> {
private String location = null;
public MyStoreOutputFormat(String location) {
this.location = location;
}
@Override
public RecordWriter<WritableComparable, Tuple> getRecordWriter(
TaskAttemptContext job) throws IOException, InterruptedException {
Configuration conf = job.getConfiguration();
String extension = location;
Path file = getDefaultWorkFile(job, extension);
FileSystem fs = file.getFileSystem(conf);
FSDataOutputStream fileOut = fs.create(file, false);
return new MyStoreRecordWriter(fileOut);
}
protected static class MyStoreRecordWriter extends
RecordWriter<WritableComparable, Tuple> {
DataOutputStream out = null;
public MyStoreRecordWriter(DataOutputStream out) {
this.out = out;
}
@Override
public void close(TaskAttemptContext taskContext) throws IOException,
InterruptedException {
// close the location
}
@Override
public void write(WritableComparable key, Tuple value)
throws IOException, InterruptedException {
// write the data to location
if (out != null) {
out.writeChars(value.toString()); // will be calling API later. let me first dump to the location!
}
}
}
}
我錯過了什麼嗎?
請幫忙。我迫切需要它。謝謝! –