2016-09-18 73 views

I have this API to parse, but Jsoup returns different output than a web browser does.

https://data.studentedge.com.au/api/comments/getpage?page=1&sort=Oldest&url=%2Fforums%2Fdetails%2Fany-surfers-out-there

When I open it in a web browser (with or without JavaScript enabled), it returns:

{"Items":[{"CommentBody":"<p>I ride a 5'9 and am from the mid north coast</p>\r\n\r\n","MemberName":"Jack F","AvatarUrl":"https://studentedgeapplication.blob.core.windows.net/profiles/21450f07-ddcc-4f19-8cba-296f22e84ee1.jpeg","PostDate":"2016-09-03T01:38:38+00:00","CommentId":"f1c50066-69b3-4a92-bc0c-a676001b174f","ParentId":null,"PosterId":"28936bc3-f705-45d6-8f94-a5b0004585c6","Status":"Approved","CurrentMemberComment":false,"UpvoteCount":0,"MemberHasUpvoted":false,"PageUrl":"/forums/details/any-surfers-out-there","IsModerator":false},{"CommentBody":"<p>I surf everyday on Google Chrome - SA here ;)</p>\r\n\r\n","MemberName":"Bryan A","AvatarUrl":"https://studentedgeapplication.blob.core.windows.net/profiles/02a713ee-2ca1-4029-85f8-314878386621.png","PostDate":"2016-09-09T10:36:47+00:00","CommentId":"689460a2-4b02-4ca7-851c-a67c00aee6ab","ParentId":null,"PosterId":"5192fcf7-703b-4f78-b6fd-a3a000427119","Status":"Approved","CurrentMemberComment":false,"UpvoteCount":1,"MemberHasUpvoted":false,"PageUrl":"/forums/details/any-surfers-out-there","IsModerator":false},{"CommentBody":"<p>Same... Chrome's the only thing I surf....</p>\r\n<p>My mate goes 5'10&quot; and also snowboards...</p>\r\n\r\n","MemberName":"Sandy S","AvatarUrl":"https://studentedgeapplication.blob.core.windows.net/profiles/6542e863-3f04-496d-b4aa-d6adcb16ca39.jpg","PostDate":"2016-09-09T10:51:40+00:00","CommentId":"9479d9f2-845a-48a5-8d28-a67c00b2fcd7","ParentId":"689460a2-4b02-4ca7-851c-a67c00aee6ab","PosterId":"165dc3d0-9e3d-484f-b5be-a3a100cfc691","Status":"Approved","CurrentMemberComment":false,"UpvoteCount":0,"MemberHasUpvoted":false,"PageUrl":"/forums/details/any-surfers-out-there","IsModerator":false}],"PageNumber":1,"Order":"Oldest"} 

That is perfectly valid JSON. However, when I use Jsoup, it returns:

<html> <head></head> <body> {"Items":[{"CommentBody":" <p>I ride a 5'9 and am from the mid north coast</p>\r\n\r\n","MemberName":"Jack F","AvatarUrl":"https://studentedgeapplication.blob.core.windows.net/profiles/21450f07-ddcc-4f19-8cba-296f22e84ee1.jpeg","PostDate":"2016-09-03T01:38:38+00:00","CommentId":"f1c50066-69b3-4a92-bc0c-a676001b174f","ParentId":null,"PosterId":"28936bc3-f705-45d6-8f94-a5b0004585c6","Status":"Approved","CurrentMemberComment":false,"UpvoteCount":0,"MemberHasUpvoted":false,"PageUrl":"/forums/details/any-surfers-out-there","IsModerator":false},{"CommentBody":" <p>I surf everyday on Google Chrome - SA here ;)</p>\r\n\r\n","MemberName":"Bryan A","AvatarUrl":"https://studentedgeapplication.blob.core.windows.net/profiles/02a713ee-2ca1-4029-85f8-314878386621.png","PostDate":"2016-09-09T10:36:47+00:00","CommentId":"689460a2-4b02-4ca7-851c-a67c00aee6ab","ParentId":null,"PosterId":"5192fcf7-703b-4f78-b6fd-a3a000427119","Status":"Approved","CurrentMemberComment":false,"UpvoteCount":1,"MemberHasUpvoted":false,"PageUrl":"/forums/details/any-surfers-out-there","IsModerator":false},{"CommentBody":" <p>Same... Chrome's the only thing I surf....</p>\r\n <p>My mate goes 5'10" and also snowboards...</p>\r\n\r\n","MemberName":"Sandy S","AvatarUrl":"https://studentedgeapplication.blob.core.windows.net/profiles/6542e863-3f04-496d-b4aa-d6adcb16ca39.jpg","PostDate":"2016-09-09T10:51:40+00:00","CommentId":"9479d9f2-845a-48a5-8d28-a67c00b2fcd7","ParentId":"689460a2-4b02-4ca7-851c-a67c00aee6ab","PosterId":"165dc3d0-9e3d-484f-b5be-a3a100cfc691","Status":"Approved","CurrentMemberComment":false,"UpvoteCount":0,"MemberHasUpvoted":false,"PageUrl":"/forums/details/any-surfers-out-there","IsModerator":false}],"PageNumber":1,"Order":"Oldest"} </body></html> 

Jsoup code:

Document doc = Jsoup.connect(baseUrl + keyword)
        .followRedirects(true)
        .ignoreContentType(true)
        .userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:43.0) Gecko/20100101 Firefox/43.0")
        .header("Accept-Encoding", "gzip, deflate")
        .header("Accept-Language", "en-US,en;q=0.5")
        .header("Host", "data.studentedge.com.au")
        .header("Origin", "https://studentedge.com.au")
        .header("Referer", "https://studentedge.com.au/forums/details/any-surfers-out-there")
        .get();
String result = doc.html();

Note: if I use doc.text() instead, the JSON comes out broken in some way.

Answer


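Use execute().body() to get the raw data. get() parses the response as an HTML document, which is why the JSON comes back wrapped in <html> and <body> tags; execute() returns the Connection.Response, and its body() is the unparsed response text: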

// Same request as in the question; only the final call changes.
String result = Jsoup.connect(baseUrl + keyword)
        .followRedirects(true)
        .ignoreContentType(true)
        .userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:43.0) Gecko/20100101 Firefox/43.0")
        .header("Accept-Encoding", "gzip, deflate")
        .header("Accept-Language", "en-US,en;q=0.5")
        .header("Host", "data.studentedge.com.au")
        .header("Origin", "https://studentedge.com.au")
        .header("Referer", "https://studentedge.com.au/forums/details/any-surfers-out-there")
        // execute() skips HTML parsing; body() returns the response text as sent by the server
        .execute().body();
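
As a side note on why the parsed output stops being valid JSON: comparing the two dumps in the question, the browser shows 5'10&quot; in the third comment while the Jsoup output shows 5'10", so the HTML parser appears to have decoded the entity into a bare double quote inside a JSON string. The sketch below only imitates that one step with the standard library; it does not call Jsoup, and the class and method names (EntityDecodeDemo, decodeQuotEntity, countUnescapedQuotes) are made up for illustration.

```java
/** Illustrative only: mimics, for one entity, what HTML parsing does to a JSON payload. */
final class EntityDecodeDemo {

    // Hypothetical stand-in for HTML entity decoding.
    // A real HTML parser handles many more entities than &quot;.
    static String decodeQuotEntity(String s) {
        return s.replace("&quot;", "\"");
    }

    // Counts double quotes that are not escaped with a backslash,
    // i.e. the quotes that open or close a JSON string.
    static int countUnescapedQuotes(String s) {
        int count = 0;
        boolean escaped = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (escaped) {
                escaped = false;          // this character was escaped, skip it
            } else if (c == '\\') {
                escaped = true;           // next character is escaped
            } else if (c == '"') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Fragment modelled on the third comment in the response above.
        String raw = "{\"CommentBody\":\"<p>My mate goes 5'10&quot; and also snowboards...</p>\"}";
        String parsed = decodeQuotEntity(raw);

        System.out.println(countUnescapedQuotes(raw));    // 4: every string is opened and closed
        System.out.println(countUnescapedQuotes(parsed)); // 5: the stray quote ends a string early
    }
}
```

An odd number of unescaped quotes means some string is never closed, which is enough for a JSON parser to reject the text. Reading the body with execute().body() avoids the problem because the payload is never run through the HTML parser in the first place.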