2011-04-18 86 views
2

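Multithreading problem with HttpClient

I'm running into a multithreading problem with HttpClient. The scenario is as follows:

Thread A requests the URL http://blap.com?param=2

Thread B requests the URL http://blap.com?param=3

This works about 98% of the time, but occasionally thread A receives the data for thread B's URL, and vice versa.

Each thread creates its own HttpClient instance, so in theory I didn't think I needed MultiThreadedHttpConnectionManager.

Does the behavior I'm describing sound plausible, and would using MultiThreadedHttpConnectionManager fix it?

I'm using Java 1.6 and Apache HttpComponents client 4.0.3.

Update: here is the function in question.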

public void get_url(String strDataSet) throws SQLException, MalformedURLException, IOException
{
    String query;

    query = "select * from jobs where data_set='" + strDataSet + "'";

    ResultSet rs2 = dbf.db_run_query(query);
    rs2.next();

    HttpClient httpclient = new DefaultHttpClient();
    HttpResponse response;

    String strURL;
    strURL = rs2.getString("url_static");

    if (rs2.getString("url_dynamic")!=null && !rs2.getString("url_dynamic").isEmpty())
        strURL = strURL.replace("${dynamic}", rs2.getString("url_dynamic"));

    UtilityFunctions.stdoutwriter.writeln("Retrieving URL: " + strURL,Logs.STATUS2,"DG25");

    if (!strURL.contains(":"))
        UtilityFunctions.stdoutwriter.writeln("WARNING: url is not preceeded with a protocol" + strURL,Logs.STATUS1,"DG25.5");

    // HttpGet chokes on the ^ character
    strURL = strURL.replace("^","%5E");

    HttpGet httpget = new HttpGet(strURL);

    /*
     * The following line fixes an issue where a non-fatal error is displayed about an invalid cookie data format.
     * It turns out that some sites generate a warning with this code, and others without it.
     * I'm going to kludge this for now until I get more data on which urls throw the
     * warning and which don't.
     *
     * warning with code: www.exchange-rates.org
     */
    if (!(strCurDataSet.contains("xrateorg") || strCurDataSet.contains("google") || strCurDataSet.contains("mwatch")))
    {
        httpget.getParams().setParameter("http.protocol.cookie-datepatterns",
            Arrays.asList("EEE, dd MMM-yyyy-HH:mm:ss z", "EEE, dd MMM yyyy HH:mm:ss z"));
    }

    response = httpclient.execute(httpget);

    HttpEntity entity = response.getEntity();

    BufferedReader in = new BufferedReader(
        new InputStreamReader(
            entity.getContent()));

    int nTmp;

    returned_content="";

    while ((nTmp = in.read()) != -1)
        returned_content = returned_content + (char)nTmp;

    in.close();

    httpclient.getConnectionManager().shutdown();

    UtilityFunctions.stdoutwriter.writeln("Done reading url contents",Logs.STATUS2,"DG26");
}

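Update: I've narrowed the problem down to this line:

response = httpclient.execute(httpget); 

If I put a lock around that line, the problem goes away. The trouble is that this is the most time-consuming part, and I don't want only one thread at a time to be able to retrieve HTTP data.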


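This method looks like it handles a lot of unrelated things at once: fetching the URL from the database, validating the URL, and reading the data from the corresponding HTTP connection. Have you considered refactoring it into several classes to make it easier to maintain and unit test? By the way, what happens if 'rs.next()' returns 'false'? In the current code the race condition could be anywhere, even at the database level. – 2011-04-18 20:09:48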

Answers

0

Your code is not thread-safe. To fix your immediate problem you need to declare the HttpClient as a ThreadLocal, but there is a lot more that needs fixing.
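
For readers unfamiliar with the pattern, here is a minimal sketch of what holding one client per thread in a ThreadLocal looks like. StubClient is a hypothetical stand-in, not the Apache HttpClient class, and all names in the sketch are assumptions made for illustration. In the posted method the client is already a local variable created on every call, so this pattern would mainly avoid re-creating the client rather than add isolation.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalClientSketch {
    // Hypothetical stand-in for an HTTP client; NOT the Apache HttpClient class.
    static class StubClient {
        private static final AtomicInteger NEXT_ID = new AtomicInteger();
        final int id = NEXT_ID.incrementAndGet();
    }

    // Every thread that calls CLIENT.get() lazily receives its own StubClient.
    private static final ThreadLocal<StubClient> CLIENT = new ThreadLocal<StubClient>() {
        @Override
        protected StubClient initialValue() {
            return new StubClient();
        }
    };

    // Returns the id of the client bound to the calling thread.
    static int currentClientId() {
        return CLIENT.get().id;
    }

    // Starts two threads and records which client id each one saw.
    static int[] idsSeenByTwoThreads() {
        final int[] ids = new int[2];
        Thread a = new Thread(new Runnable() {
            public void run() { ids[0] = currentClientId(); }
        });
        Thread b = new Thread(new Runnable() {
            public void run() { ids[1] = currentClientId(); }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids;
    }

    public static void main(String[] args) {
        int[] ids = idsSeenByTwoThreads();
        System.out.println("thread A client: " + ids[0] + ", thread B client: " + ids[1]);
    }
}
```

The two threads always see different client ids, while repeated calls from the same thread return the same one.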


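I don't see the benefit of making the HttpClient thread-local. ThreadLocal makes it global within a thread, but this is the only method that uses that variable. Any details on what else is not thread-safe would also be appreciated. – opike 2011-04-18 21:03:11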

0

You need to create a new HttpContext in each thread and pass it to HttpClient.execute:

HttpContext localContext = new BasicHttpContext(); 
response = httpclient.execute(httpget, localContext); 

See the bottom of this document (from the HttpClient 4 tutorial):

http://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html

There is also a thread-safe HttpContext implementation (SyncBasicHttpContext), but I'm not sure whether it is needed in this case.
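
The same principle (keep per-request state local to the thread) may also apply to the result. In the posted method, returned_content and strCurDataSet are not declared locally. If they are fields of an object shared by both threads (an assumption; the post does not show this), one thread could overwrite the other's value, which would look exactly like A receiving B's data. A minimal sketch of returning the body from the task instead, where fetch is a hypothetical stand-in for the real HTTP call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LocalResultSketch {
    // Hypothetical stand-in for the HTTP round trip; a real version would call
    // httpclient.execute(httpget, localContext) and read the response entity.
    static String fetch(String url) {
        return "body of " + url;
    }

    // Runs one fetch per URL on a worker thread and returns the bodies in input order.
    static List<String> fetchAll(List<String> urls) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, urls.size()));
        try {
            List<Future<String>> futures = new ArrayList<Future<String>>();
            for (final String url : urls) {
                futures.add(pool.submit(new Callable<String>() {
                    public String call() {
                        // The result lives in a local variable, never in a shared field.
                        String body = fetch(url);
                        return body;
                    }
                }));
            }
            List<String> bodies = new ArrayList<String>();
            for (Future<String> f : futures) {
                bodies.add(f.get());
            }
            return bodies;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<String>();
        urls.add("http://blap.com?param=2");
        urls.add("http://blap.com?param=3");
        System.out.println(fetchAll(urls));
    }
}
```

Because each task hands its result back through its own Future, the requests can still run concurrently without a lock around the HTTP call.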