
I am running a MapReduce program on Hadoop. The file exists in HDFS, but a java.io.FileNotFoundException is thrown.

The InputFormat passes each file path to the mapper.

I can check the file from the command line like this,

$ hadoop fs -ls hdfs://slave1.kdars.com:8020/user/hadoop/num_5/13.pdf

找到1 items -rwxrwxrwx 3 hdfs hdfs 184269 2015-03-31 22:50 hdfs://slave1.kdars.com:8020/user/hadoop/num_5/13.pdf

However, when I try to open it from the mapper, it does not work.

15/04/01 06:13:04 INFO mapreduce.Job: Task Id : attempt_1427882384950_0025_m_000002_2, Status : FAILED Error: java.io.FileNotFoundException: hdfs:/slave1.kdars.com:8020/user/hadoop/num_5/13.pdf (No such file or directory)

at java.io.FileInputStream.open(Native Method) 
at java.io.FileInputStream.<init>(FileInputStream.java:146) 
at java.io.FileInputStream.<init>(FileInputStream.java:101) 
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1111) 
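The stack trace already points at the root cause: the path ends up in `java.io.FileInputStream` (via PDFBox's `PDDocument.load`), and the `java.io` classes only understand local filesystem paths, not `hdfs://` URIs. A minimal stdlib-only sketch of the mismatch, using the path from the question:

```java
import java.io.File;

public class LocalFileVsHdfsUri {
    public static void main(String[] args) {
        // java.io.File treats the whole URI as a literal local path.
        // It also collapses the double slash after the scheme, which is
        // why the error message shows "hdfs:/slave1..." with one slash.
        File f = new File("hdfs://slave1.kdars.com:8020/user/hadoop/num_5/13.pdf");
        System.out.println(f.getPath());  // hdfs:/slave1.kdars.com:8020/user/hadoop/num_5/13.pdf
        System.out.println(f.exists());   // false: no such path on the local disk
    }
}
```

To read a file that lives in HDFS, the path has to be resolved through Hadoop's `org.apache.hadoop.fs.FileSystem` API instead.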

I have checked that the InputFormat works fine and that the mapper receives the correct file path. The mapper code looks like this,

@Override
public void map(Text title, Text file, Context context) throws IOException, InterruptedException {

    long time = System.currentTimeMillis();
    SimpleDateFormat dayTime = new SimpleDateFormat("yyyy-mm-dd hh:mm:ss");
    String str = dayTime.format(new Date(time));

    File temp = new File(file.toString());
    if (temp.exists()) {
        DBManager.getInstance().insertSQL("insert into `plagiarismdb`.`workflow` (`type`) value ('" + temp + " is exists')");
    } else {
        DBManager.getInstance().insertSQL("insert into `plagiarismdb`.`workflow` (`type`) value ('" + temp + " is not exists')");
    }
}

Please help me.

Answers


Try this in your mapper. First import these:

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Then use this inside the mapper method:

    FileSystem fs = FileSystem.get(new Configuration());

    Path path = new Path(value.toString());
    System.out.println(path);
    if (fs.exists(path))
        context.write(value, one);
    else
        context.write(value, zero);

Thank you so much!!!! It solved the problem! – ShineH 2015-04-01 12:45:00

package com.tcb;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

public class YellowTaxi {

    public static void main(String[] args) throws IOException {

        // Open the file through the Hadoop HDFS Java API, not java.io
        String hdfsPath = "hdfs://localhost:8020/user/YellowTaxi/yellowTaxi.csv";
        URI uri = URI.create(hdfsPath);
        Configuration c = new Configuration();

        // Bind the FileSystem to the cluster named in the URI
        FileSystem fs = FileSystem.get(uri, c);
        Path path = new Path(uri);

        if (fs.exists(path)) {
            // Read the content of the csv file from HDFS; a plain
            // FileReader would fail here because it only understands
            // local paths, which is exactly the exception in the question
            String cvsSplitBy = ",";
            JsonObject taxiTrips = new JsonObject();
            JsonArray array = new JsonArray();

            try (BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(path)))) {
                String line;
                while ((line = br.readLine()) != null) {
                    // use comma as separator; allocate a fresh JsonObject
                    // per row so the array entries do not all alias one object
                    String[] word = line.split(cvsSplitBy);
                    JsonObject item = new JsonObject();
                    for (int i = 0; i < word.length - 1; i++) {
                        item.addProperty(word[i], word[i + 1]);
                    }
                    array.add(item);
                }
            }
            taxiTrips.add("Trips", array);
        }
    }
}
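The per-row grouping in the answer above can be shown without Gson on the classpath. This is a hedged, stdlib-only sketch (the field names and sample rows are made up for illustration); the key point is that a fresh map must be allocated for each row, since reusing a single object across iterations would make every list entry alias the last row's data:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FreshObjectPerRow {
    public static void main(String[] args) {
        // Hypothetical CSV rows of key,value pairs
        String[] lines = { "vendor,1,distance,2.5", "vendor,2,distance,0.9" };
        List<Map<String, String>> trips = new ArrayList<>();

        for (String line : lines) {
            String[] word = line.split(",");
            // Fresh map per row, not one shared instance
            Map<String, String> item = new LinkedHashMap<>();
            for (int i = 0; i < word.length - 1; i += 2) {
                item.put(word[i], word[i + 1]);
            }
            trips.add(item);
        }

        System.out.println(trips.size());               // 2
        System.out.println(trips.get(0).get("vendor")); // 1
    }
}
```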

Hi, welcome to Stack Overflow. When posting an answer it is normal to explain why the code solves the specific problem posed in the original question, rather than posting a large undocumented block of code. Thanks. – Spangen 2018-01-15 13:26:55
