Symfony DomCrawler isn't finding a specific tag

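I'm using DomCrawler to get data from Google Play pages, and it works in 99% of cases, except that I stumbled upon one page where it can't find a specific div. I checked the HTML and the div is definitely there. My code is: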

$autoloader = require __DIR__.'\vendor\autoload.php'; 
use Symfony\Component\DomCrawler\Crawler; 

$app_id = 'com.balintinfotech.sinhalesekeyboardfree'; 

$response = file_get_contents('https://play.google.com/store/apps/details?id='.$app_id); 
$crawler = new Crawler($response); 
echo $crawler->filter('div[itemprop="datePublished"]')->text();

When I run it against this particular page, I get:

PHP Fatal error: Uncaught InvalidArgumentException: The current node list is empty.

However, if I use any other ID, I get the desired result. What exactly is it about this page that breaks DomCrawler?

2017-09-13 John Baker

Does this only happen with this one page for you? I was able to get it working: '14 de marzo de 2017' (by copy/pasting your code directly) – ishegg

@ishegg Just this one page. I see you got your result in Spanish, so this seems to affect only the English page.

@ishegg Can you try the following URL: 'https://play.google.com/store/apps/details?id=com.balintinfotech.sinhalesekeyboardfree&hl=en'

As you correctly figured out, the date isn't found on the English version of the page, but it is on the Spanish one.

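One difference I could spot is a user review that says නියමයි ඈ. Something in there seems to be upsetting the Crawler. If you replace the null character (\x00) with an empty string, it correctly returns what you're looking for: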

<?php
// Load Composer's autoloader so the Crawler class can be resolved.
require __DIR__.'/vendor/autoload.php';

$app_id = 'com.balintinfotech.sinhalesekeyboardfree';
$response = file_get_contents('https://play.google.com/store/apps/details?hl=en&id='.$app_id);
// Strip null bytes before handing the HTML to the Crawler.
$response = str_replace("\x00", "", $response);
$crawler = new Symfony\Component\DomCrawler\Crawler($response);
var_dump($crawler->filter('div[itemprop="datePublished"]')->text()); // string(14) "March 14, 2017"
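
To confirm that a response really contains null bytes before stripping them, a small helper along these lines might help. This is only a sketch using PHP built-ins: the function name `stripNulBytes` and the sample markup are made up for illustration and are not part of DomCrawler or the actual Google Play page.

```php
<?php
// Hypothetical helper (not part of Symfony): removes NUL bytes from a string
// and reports how many were found, so you can tell whether they were the culprit.
function stripNulBytes(string $html): array
{
    $count = substr_count($html, "\x00");
    return [str_replace("\x00", '', $html), $count];
}

// Made-up sample standing in for the fetched page: a review containing a NUL
// byte, followed by the element we actually want.
$sample = "<div>review\x00text</div><div itemprop=\"datePublished\">March 14, 2017</div>";

[$clean, $removed] = stripNulBytes($sample);

echo $removed, " NUL byte(s) removed\n";
```

If the count is zero for a page that still fails, the cause is probably something else and the stripping step can be dropped.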

I'll try to look into this some more.

2017-09-13 20:10:01 ishegg

Nice catch. I wonder whether this is a bug in DomCrawler. I had to delete my earlier reply, because encoding as UTF-8 didn't actually work.

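It isn't. Note that it's 'file_get_contents()' that truncates the result when it finds the null character; 'DomCrawler' is doing its job. So the problem seems to be on the PHP side. It may even go deeper. – ishegg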

It doesn't get truncated on my end. I get the whole HTML.
