2014-02-25

Unable to log in to WSJ using HtmlUnit/HttpClient

I am a paying WSJ subscriber. I want to log in to WSJ using HtmlUnit, but I have not been able to. Here is my code:

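    import com.gargoylesoftware.htmlunit.BrowserVersion;
    import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlForm;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;
    import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
    import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
    import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

    // Browser emulation with JavaScript, redirects and cookies enabled
    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setCssEnabled(false);
    webClient.getOptions().setRedirectEnabled(true);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    webClient.getCookieManager().setCookiesEnabled(true);

    // Load the standalone login page (getPage is an instance method) and take its first form
    final HtmlPage page1 = webClient.getPage("https://id.wsj.com/access/50f57264bd7fb2d2f6629af6/latest/login_standalone.html");
    final HtmlForm form = page1.getForms().get(0);

    // Fill in the credentials
    final HtmlTextInput textField = form.getInputByName("username");
    final HtmlPasswordInput pwd = form.getInputByName("password");
    textField.setValueAttribute("xxxxx");
    pwd.setValueAttribute("xxxx");

    // Click the "Log In" button; page2 is whatever page the click returns
    final HtmlSubmitInput button = (HtmlSubmitInput) form.getInputsByValue("Log In").get(0);
    final HtmlPage page2 = button.click();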

I don't know what I am missing. Earlier I tried Apache HttpClient, but still without success.

The HttpClient code:

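    import java.io.FileWriter;
    import java.io.StringWriter;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.commons.io.IOUtils;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.CookieStore;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.client.protocol.ClientContext;
    import org.apache.http.impl.client.BasicCookieStore;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClientBuilder;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.protocol.BasicHttpContext;
    import org.apache.http.protocol.HttpContext;

    // Share one cookie store across both requests through the context
    CloseableHttpClient httpclient = HttpClientBuilder.create().build();
    CookieStore cookieStore = new BasicCookieStore();
    HttpContext httpContext = new BasicHttpContext();
    httpContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);

    // POST the login form. No explicit Content-Type header is set here:
    // UrlEncodedFormEntity supplies application/x-www-form-urlencoded, which matches the body.
    HttpPost httpPost = new HttpPost("https://id.wsj.com/access/50f57264bd7fb2d2f6629af6/latest/login_standalone.html");
    httpPost.setHeader("Accept-Encoding", "gzip, deflate");
    httpPost.setHeader("Host", "id.wsj.com");
    httpPost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0");
    httpPost.setHeader("X-HTTP-Method-Override", "POST");
    httpPost.setHeader("X-Requested-With", "XMLHttpRequest");

    List<NameValuePair> urlParameters = new ArrayList<NameValuePair>();

    // Pass the raw URL; UrlEncodedFormEntity percent-encodes the values itself
    urlParameters.add(new BasicNameValuePair("landing_page", "http://india.wsj.com/"));
    urlParameters.add(new BasicNameValuePair("realm", "default"));
    urlParameters.add(new BasicNameValuePair("template", "default"));
    urlParameters.add(new BasicNameValuePair("username", "xxxx"));
    urlParameters.add(new BasicNameValuePair("password", "xxxx"));
    urlParameters.add(new BasicNameValuePair("savelogin", "true"));

    httpPost.setEntity(new UrlEncodedFormEntity(urlParameters));

    HttpResponse response1 = httpclient.execute(httpPost, httpContext);

    System.out.println(response1.getStatusLine().getStatusCode());

    // Request a subscriber article with the same context (and therefore the same cookies)
    HttpGet getRequest = new HttpGet("http://online.wsj.com/news/articles/SB10001424052702304834704579404391984581058?mod=WSJ_LatestHeadlines&mg=reno64-wsj");

    response1 = httpclient.execute(getRequest, httpContext);

    // Save the response body to a file for inspection
    StringWriter writer = new StringWriter();
    IOUtils.copy(response1.getEntity().getContent(), writer, "UTF-8");
    String theString = writer.toString();
    FileWriter fileWriter = new FileWriter("C:/Users/xxxsx/Desktop/xx.html");
    fileWriter.write(theString);
    fileWriter.close();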
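
A side note on the form parameters in an attempt like this: values handed to a URL-encoded form body are percent-encoded when the body is built, so a value that has already been percent-encoded by hand ends up encoded twice. The sketch below illustrates this with `java.net.URLEncoder` from the standard library as a stand-in for the same `application/x-www-form-urlencoded` scheme. The class `FormEncodingSketch` and its `encodeForm` helper are hypothetical names used only for illustration; they are not part of HttpClient, and this is not a claim about what the WSJ endpoint expects.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class FormEncodingSketch {

    // Builds an application/x-www-form-urlencoded body from raw (unencoded) values.
    public static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Raw value: encoded exactly once
        Map<String, String> raw = new LinkedHashMap<String, String>();
        raw.put("landing_page", "http://india.wsj.com/");
        raw.put("realm", "default");
        System.out.println(encodeForm(raw));
        // prints landing_page=http%3A%2F%2Findia.wsj.com%2F&realm=default

        // Value that was already percent-encoded: each '%' becomes '%25'
        Map<String, String> preEncoded = new LinkedHashMap<String, String>();
        preEncoded.put("landing_page", "http%3A%2F%2Findia.wsj.com%2F");
        System.out.println(encodeForm(preEncoded));
        // prints landing_page=http%253A%252F%252Findia.wsj.com%252F
    }
}
```

Whether double encoding matters depends on how the server decodes the field, so this is something to rule out rather than a confirmed cause of the failed login.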

Can anyone help?

Guys, I finally managed to log in using Selenium!

Do you get any exception when using HtmlUnit? Or could you paste the HTML for the username field, the password field, and the login button? – Kick

No sir, no exception is thrown. And no, I can't make my username/password public. –

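I didn't ask for your credentials; please read it again, I asked for the HTML. I have one question: when I enter a dummy username/password and click the button, nothing happens. How does the page work? In that case it should show a wrong username/password message. – Kick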

Answers

Happily, I was able to log in successfully using Selenium!