2017-08-13 51 views

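Writing files with ExecutorService and Callable in Java does not work

I am writing a routine that retrieves a list of URLs from a file, fetches the content of each URL with JSoup, looks for certain patterns, and writes the results to output files (one per analyzed URL).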

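I have a WebPageAnalyzerTask (it implements Callable). Right now it returns null, but it will eventually return an object holding the processing results (still to do):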

public WebPageAnalyzerTask(String targetUrl, Pattern searchPattern) { 
    this.targetUrl = targetUrl; 
    this.searchPattern = searchPattern; 
} 

@Override 
public WebPageAnalysisTaskResult call() throws Exception { 
    long startTime = System.nanoTime(); 
    String htmlContent = this.getHtmlContentFromUrl(); 
    List<String> resultContent = this.getAnalysisResults(htmlContent); 

    try (BufferedWriter bw = Files.newBufferedWriter(Paths.get("c:/output", UUID.randomUUID().toString() + ".txt"), 
      StandardCharsets.UTF_8, StandardOpenOption.WRITE)) { 
     bw.write(parseListToLine(resultContent)); 
    } 

    long endTime = System.nanoTime(); 
    return null; 
} 

I write the file using NIO and try-with-resources.

The code that uses the task is as follows:

/** 
* Starts the analysis of the Web Pages retrieved from the input text file using the provided pattern. 
*/ 
public void startAnalysis() { 
    List<String> urlsToBeProcessed = null; 

    try (Stream<String> stream = Files.lines(Paths.get(this.inputPath))) { 

     urlsToBeProcessed = stream.collect(Collectors.toList()); 

     if (urlsToBeProcessed != null && urlsToBeProcessed.size() > 0) { 
      List<Callable<WebPageAnalysisTaskResult>> pageAnalysisTasks = this 
        .buildPageAnalysisTasksList(urlsToBeProcessed); 
      ExecutorService executor = Executors.newFixedThreadPool(THREAD_POOL_SIZE); 
      List<Future<WebPageAnalysisTaskResult>> results = executor.invokeAll(pageAnalysisTasks); 
      executor.shutdown(); 
     } else { 
      throw new NoContentToProcessException(); 
     } 

    } catch (Exception e) { 
     e.printStackTrace(); 
    } 
} 

/** 
* Builds a list of tasks in which each task will be filled with data required for the analysis processing. 
* @param urlsToBeProcessed The list of URLs to be processed. 
* @return A list of tasks that must be handled by an executor service for asynchronous processing. 
*/ 
private List<Callable<WebPageAnalysisTaskResult>> buildPageAnalysisTasksList(List<String> urlsToBeProcessed) { 
    List<Callable<WebPageAnalysisTaskResult>> tasks = new ArrayList<>(); 
    UrlValidator urlValidator = new UrlValidator(ALLOWED_URL_SCHEMES); 

    urlsToBeProcessed.forEach(urlAddress -> { 
     if (urlValidator.isValid(urlAddress)) { 
      tasks.add(new WebPageAnalyzerTask(urlAddress, this.targetPattern)); 
     } 
    }); 

    return tasks; 
} 

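The file holding the list of URLs is read once. The code then builds one task per URL, and the ExecutorService runs them asynchronously; each task analyzes its page and writes its own result file.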

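At the moment the input file is read, and the HTML content of each URL is analyzed and stored in a string. However, the task never writes the output file, and no error shows up. So I am wondering what is going on there.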

Can someone tell me whether I am missing something?

Thanks in advance.


You are writing the file with `java.io.BufferedWriter`, not NIO. – EJP

Answers


Perhaps you are getting an exception in the following try block:

try (BufferedWriter bw = Files.newBufferedWriter(Paths.get("c:/output", UUID.randomUUID().toString() + ".txt"), 
     StandardCharsets.UTF_8, StandardOpenOption.WRITE)) { 
    bw.write(parseListToLine(resultContent)); 
} 

Try adding a catch block to it and printing the exception, so that if something does go wrong you can see the cause:

catch (IOException e) { 
    // Replace with logger or some kind of error handling in production code 
    e.printStackTrace(); 
} 

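Shame on me! I completely forgot the catch block. After adding it, I found that an IOException was being thrown when the task tried to write the file into a directory that does not exist. Thanks! –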
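The comment above traces the failure to a missing output directory. A minimal sketch of one way to guard against that follows, assuming the directory may be created on demand; the class `OutputFileSketch` and the method `writeResult` are hypothetical names, not part of the original code. It also opens the file with `CREATE_NEW`, because `StandardOpenOption.WRITE` on its own only opens an existing file and throws `NoSuchFileException` for a fresh UUID-named path.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

public class OutputFileSketch {

    // Hypothetical helper: writes content to a new, randomly named .txt file
    // under outputDir and returns the path of the file that was written.
    public static Path writeResult(Path outputDir, String content) {
        try {
            // Creates the directory and any missing parents; a no-op if it already exists.
            Files.createDirectories(outputDir);
            Path target = outputDir.resolve(UUID.randomUUID() + ".txt");
            // CREATE_NEW creates the file; WRITE alone would fail because it does not exist yet.
            try (BufferedWriter bw = Files.newBufferedWriter(target, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
                bw.write(content);
            }
            return target;
        } catch (IOException e) {
            // Rethrown unchecked so the failure is never silently dropped.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Uses a temporary directory so the sketch runs on any machine.
        Path base = Files.createTempDirectory("analysis-sketch");
        Path written = writeResult(base.resolve("output"), "example line");
        System.out.println(Files.readAllLines(written, StandardCharsets.UTF_8));
    }
}
```

Inside `call()` the same two steps (create the directory, then open with a creating option) could replace the original try block; whether the directory should be created automatically or treated as a configuration error is a design choice for the application.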


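Because the task is failing inside the call() method of the WebPageAnalyzerTask class, you should inspect the result of List<Future<WebPageAnalysisTaskResult>> results = executor.invokeAll(pageAnalysisTasks); to find out what error occurred while the tasks were running. An exception thrown inside call() is stored in the Future and only surfaces, wrapped in an ExecutionException, when you call get():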

for (Future<WebPageAnalysisTaskResult> future : results) { 
    try { 
        future.get(); 
    } catch (InterruptedException e) { 
        e.printStackTrace(); 
    } catch (ExecutionException e) { 
        e.printStackTrace(); 
    } 
}
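To see why the original code printed nothing, here is a small self-contained sketch of the same idea. The class `FutureFailureSketch`, the method `runAndCollectFailures`, and the simulated tasks are illustrative assumptions rather than the asker's code. One task succeeds and one throws; the failure only becomes visible when `get()` is called on its Future.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureFailureSketch {

    // Hypothetical helper: runs all tasks and returns a description of each failure.
    public static List<String> runAndCollectFailures(List<Callable<String>> tasks) {
        List<String> failures = new ArrayList<>();
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // invokeAll blocks until every task has completed, normally or exceptionally.
            for (Future<String> future : executor.invokeAll(tasks)) {
                try {
                    future.get();
                } catch (ExecutionException e) {
                    // getCause() is the exception originally thrown inside call().
                    failures.add(String.valueOf(e.getCause()));
                }
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can react to it.
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return failures;
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = Arrays.asList(
                () -> "ok",
                () -> { throw new IOException("simulated write failure"); });
        System.out.println(runAndCollectFailures(tasks));
    }
}
```

Without the `get()` call, the second task's IOException would stay inside its Future and the program would finish silently, which matches the behaviour described in the question.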