How to parse a LinkedIn page

Can someone help me parse this link with curl?

https://www.linkedin.com/in/williamhgates/

Here is my code.

Just run it and see the result:

$url = "https://www.linkedin.com/in/williamhgates/";
$ch = curl_init($url);
// Return the response body as a string instead of printing it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
// Disables TLS certificate verification.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
// Note: a Host header normally carries only the host name, not the path.
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: www.linkedin.com/in/williamhgates/'));
$output = curl_exec($ch);
curl_close($ch);
dd($output);
die;

I just want to get the full page source as a single file, but it shows:

Could not process this client request HTTP method request for URL

Source

2017-02-15 Hamed

Maybe it is missing headers such as a User-Agent. – Jer
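A minimal sketch of the suggestion above, assuming the goal is simply to send a browser-like User-Agent and look at the status code that comes back. The helper names buildCurlOptions and fetchPage, and the User-Agent string used in the note below, are made up for illustration; nothing here guarantees that LinkedIn will accept the request.

```php
<?php
// Hypothetical helper: builds the cURL options for a plain GET request.
function buildCurlOptions(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_TIMEOUT        => 15,    // seconds
    ];
}

// Hypothetical helper: performs the request and returns [status code, body].
function fetchPage(string $url, string $userAgent): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, buildCurlOptions($userAgent));

    $body = curl_exec($ch);
    // The status is 0 when no HTTP response was received at all.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // The handle is released when $ch goes out of scope.
    return [$status, $body === false ? '' : $body];
}
```

Calling fetchPage('https://www.linkedin.com/in/williamhgates/', 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)') returns the status code alongside the body, so a refusal is visible even when the body is empty. No Host header is set by hand because cURL derives it from the URL.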

I tried different approaches with html_dom and curl, but none of them work! – Hamed

If allow_url_fopen is enabled in php.ini, you can use $html = file_get_html('https://www.linkedin.com/in/williamhgates/'); (from the Simple HTML DOM library) and then extract data from it with the DOM classes. –
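As a rough sketch of the "extract data with the DOM classes" step, the example below uses PHP's bundled DOMDocument instead of Simple HTML DOM's file_get_html, and it works on an HTML string you already have rather than fetching anything. The function name extractTitle and the sample markup are made up for illustration and are not taken from LinkedIn.

```php
<?php
// Hypothetical helper: returns the text of the first <title> element, or null.
function extractTitle(string $html): ?string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally; real-world HTML is rarely well formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = $doc->getElementsByTagName('title');

    return $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
}

// Made-up sample markup.
$sample = '<html><head><title> Example profile </title></head><body><p>Hi</p></body></html>';
echo extractTitle($sample), "\n";
```

The same pattern extends to other elements through getElementsByTagName or a DOMXPath query, provided the page could be downloaded in the first place.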

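LinkedIn does not allow any crawlers except a few that send it traffic (Googlebot, Bingbot, etc.), and it explicitly blocks other user agents. So it is not possible to make a curl request to a LinkedIn page. Even if you somehow manage to scrape LinkedIn, it may cause you legal problems. So it is best to leave LinkedIn alone.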

Source

2017-03-23 06:44:36 Shiva

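The standard approaches no longer seem to work for LinkedIn.
Even putting a LinkedIn page in an iframe does not work. You get a response saying Load denied by X-Frame-Options: https://www.linkedin.com does not permit cross-origin framing.
Feed43 used to work until about 5 weeks ago, and now it gets an HTTP/1.1 999 Request denied response.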
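To recognise this kind of refusal in your own script, one option is to inspect the status line and headers of the response. The sketch below assumes you already have the raw header block as a string (for example, by asking cURL for it with CURLOPT_HEADER); the function names parseResponseHead and isDenied are hypothetical, and the header values in the note below are sample input, not captured LinkedIn output.

```php
<?php
// Hypothetical helper: extracts the status code and headers from a raw
// HTTP response header block.
function parseResponseHead(string $rawHead): array
{
    $lines = preg_split('/\r\n|\n/', trim($rawHead));

    $status = 0;
    if (preg_match('#^HTTP/\S+\s+(\d{3})#', $lines[0] ?? '', $m)) {
        $status = (int) $m[1];
    }

    $headers = [];
    foreach (array_slice($lines, 1) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not a "Name: value" line
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }

    return ['status' => $status, 'headers' => $headers];
}

// Hypothetical helper: treats any 4xx/5xx status, and the non-standard 999
// quoted above, as a denied or failed request.
function isDenied(array $response): bool
{
    return $response['status'] >= 400;
}
```

For instance, passing "HTTP/1.1 999 Request denied\r\nX-Frame-Options: sameorigin" to parseResponseHead yields a status of 999, and isDenied then reports the request as refused. Header names are lower-cased so that lookups such as $response['headers']['x-frame-options'] do not depend on the server's capitalisation.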

There is an official LinkedIn plugin that gives you a member profile widget for your website - https://developer.linkedin.com/plugins/member-profile

There are some other plugins as well - https://developer.linkedin.com/plugins

But that is all you get these days.

Source

2017-05-17 22:01:26 somepaulo
