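Multithreading problem with HttpClient

I'm running into a multithreading problem with HttpClient. The scenario is as follows:
Thread A issues a request for the URL http://blap.com?param=2
Thread B issues a request for the URL http://blap.com?param=3
This works about 98% of the time, but occasionally thread A receives the data for thread B's URL, and vice versa.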
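Each thread currently creates its own HttpClient instance, so in theory I assumed I would not need MultiThreadedHttpConnectionManager.
Does the behavior I'm describing seem plausible, and would using MultiThreadedHttpConnectionManager fix it?
I'm using Java 1.6 and Apache HttpComponents Client 4.0.3.
Update: here is the function in question.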
    public void get_url(String strDataSet) throws SQLException, MalformedURLException, IOException
    {
        String query;
        query = "select * from jobs where data_set='" + strDataSet + "'";
        ResultSet rs2 = dbf.db_run_query(query);
        rs2.next();

        HttpClient httpclient = new DefaultHttpClient();
        HttpResponse response;

        String strURL;
        strURL = rs2.getString("url_static");
        if (rs2.getString("url_dynamic") != null && !rs2.getString("url_dynamic").isEmpty())
            strURL = strURL.replace("${dynamic}", rs2.getString("url_dynamic"));

        UtilityFunctions.stdoutwriter.writeln("Retrieving URL: " + strURL, Logs.STATUS2, "DG25");
        if (!strURL.contains(":"))
            UtilityFunctions.stdoutwriter.writeln("WARNING: url is not preceded by a protocol: " + strURL, Logs.STATUS1, "DG25.5");

        // HttpGet chokes on the ^ character
        strURL = strURL.replace("^", "%5E");
        HttpGet httpget = new HttpGet(strURL);

        /*
         * The following line fixes an issue where a non-fatal error is displayed about an invalid cookie data format.
         * It turns out that some sites generate a warning with this code, and others without it.
         * I'm going to kludge this for now until I get more data on which urls throw the
         * warning and which don't.
         *
         * warning with code: www.exchange-rates.org
         */
        if (!(strCurDataSet.contains("xrateorg") || strCurDataSet.contains("google") || strCurDataSet.contains("mwatch")))
        {
            httpget.getParams().setParameter("http.protocol.cookie-datepatterns",
                    Arrays.asList("EEE, dd MMM-yyyy-HH:mm:ss z", "EEE, dd MMM yyyy HH:mm:ss z"));
        }

        response = httpclient.execute(httpget);
        HttpEntity entity = response.getEntity();

        BufferedReader in = new BufferedReader(
                new InputStreamReader(
                        entity.getContent()));

        int nTmp;
        returned_content = "";
        while ((nTmp = in.read()) != -1)
            returned_content = returned_content + (char) nTmp;

        in.close();
        httpclient.getConnectionManager().shutdown();

        UtilityFunctions.stdoutwriter.writeln("Done reading url contents", Logs.STATUS2, "DG26");
    }
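One thing that may be worth checking in the function above: `returned_content` and `strCurDataSet` are not declared inside the method, so they are presumably fields. If the object that owns them is shared between threads (an assumption, the post does not say), one thread could overwrite the other's content no matter how many HttpClient instances exist. The sketch below shows the same read loop with everything kept in local variables. `LocalStateFetcher`, `openBody`, `fetch` and `fetchAll` are hypothetical names, the HTTP call is simulated with an in-memory stream, and the code uses Java 8 syntax rather than the Java 1.6 mentioned in the post.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: no fields are written, so concurrent callers cannot clobber each other.
class LocalStateFetcher {

    // Stand-in for the HTTP call (assumption): returns "body of <url>" as the response body.
    static InputStream openBody(String url) {
        return new ByteArrayInputStream(("body of " + url).getBytes(StandardCharsets.UTF_8));
    }

    // Reads the whole body into a local StringBuilder and returns it to the caller.
    static String fetch(String url) {
        StringBuilder content = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(openBody(url), StandardCharsets.UTF_8))) {
            int c;
            while ((c = in.read()) != -1) {
                content.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return content.toString();
    }

    // Fetches every URL on its own pool thread; results come back in input order.
    static List<String> fetchAll(List<String> urls) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, urls.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                futures.add(pool.submit(() -> fetch(url)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because `fetch` returns its result instead of storing it in a field, each caller only ever sees the content it requested. The `StringBuilder` also avoids the repeated string concatenation in the original loop.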
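Update: I've narrowed the problem down to this line:
    response = httpclient.execute(httpget);
If I put a thread lock around that line, the problem goes away. The trouble is that this is the most time-consuming part, and I don't want only one thread at a time to be able to retrieve HTTP data.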
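If some shared state really does have to be updated, the lock only needs to cover that update, not the slow network call. The sketch below is one possible shape for that, not a diagnosis of the code above. `NarrowLockDemo`, `slowFetch`, `retrieve` and `resultFor` are hypothetical names, the HTTP call is simulated, and the code uses Java 8 syntax. The latch exists only to demonstrate that both threads are inside the slow section at the same time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class NarrowLockDemo {
    private final Object lock = new Object();
    private final Map<String, String> results = new HashMap<>(); // shared, guarded by lock

    // Simulated slow HTTP call (assumption). Each caller counts down and then waits for
    // the latch to reach zero, which can only happen in time if the calls overlap.
    private String slowFetch(String url, CountDownLatch arrivals) throws InterruptedException {
        arrivals.countDown();
        boolean overlapped = arrivals.await(5, TimeUnit.SECONDS);
        return (overlapped ? "concurrent:" : "serialized:") + url;
    }

    void retrieve(String dataSet, String url, CountDownLatch arrivals) throws InterruptedException {
        String body = slowFetch(url, arrivals); // slow part runs without holding the lock
        synchronized (lock) {                   // lock held only for the brief shared write
            results.put(dataSet, body);
        }
    }

    String resultFor(String dataSet) {
        synchronized (lock) {
            return results.get(dataSet);
        }
    }

    // Runs two retrievals on two threads and returns the demo object holding both results.
    static NarrowLockDemo runTwoThreads() {
        NarrowLockDemo demo = new NarrowLockDemo();
        CountDownLatch arrivals = new CountDownLatch(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> a = pool.submit(() -> {
                demo.retrieve("A", "http://blap.com?param=2", arrivals);
                return null;
            });
            Future<?> b = pool.submit(() -> {
                demo.retrieve("B", "http://blap.com?param=3", arrivals);
                return null;
            });
            a.get();
            b.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return demo;
    }
}
```

Had the whole of `retrieve` been synchronized, the second thread could not enter `slowFetch` until the first had given up waiting, and the stored values would start with `serialized:` instead.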
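This method looks like it handles a lot of unrelated things at once: fetching the URL from the database, validating the URL, and reading data from the corresponding HTTP connection. Have you considered refactoring it into several classes to make it easier to maintain and unit test? By the way, what if `rs.next()` returns `false`? With the current code the race condition could be anywhere, even at the database level. – 2011-04-18 20:09:48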
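As a rough illustration of that suggestion, the URL assembly could be pulled out into a small pure function that can be unit tested without a database or a network. `JobUrlBuilder`, `buildUrl`, `hasProtocol` and `urlFromRow` are hypothetical names. The substitution rules are copied from the function in the question, and treating a missing row as an error is an assumption about the desired behavior.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

class JobUrlBuilder {

    // staticUrl is the url_static column; dynamicPart is url_dynamic (may be null or empty).
    static String buildUrl(String staticUrl, String dynamicPart) {
        if (staticUrl == null) {
            throw new IllegalArgumentException("url_static is missing");
        }
        String url = staticUrl;
        if (dynamicPart != null && !dynamicPart.isEmpty()) {
            url = url.replace("${dynamic}", dynamicPart);
        }
        // Same escaping as the original: encode the caret before building the request.
        return url.replace("^", "%5E");
    }

    // Mirrors the original sanity check: a URL without ':' has no protocol prefix.
    static boolean hasProtocol(String url) {
        return url.contains(":");
    }

    // Reads one row and fails loudly if the query returned nothing (assumed policy).
    static String urlFromRow(ResultSet rs, String dataSet) throws SQLException {
        if (!rs.next()) {
            throw new SQLException("no job found for data set " + dataSet);
        }
        return buildUrl(rs.getString("url_static"), rs.getString("url_dynamic"));
    }
}
```

With the pieces separated like this, the database lookup, the URL logic and the HTTP retrieval can each be exercised on their own, which should make it easier to see where a race actually occurs.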