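I want to extract data from a series of strings, but have had no luck. In the sample code below I tried preg_split, but it does not give me the result I want. The goal is to extract the source URL and the anchor text from the string.
Using the code below: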
<?php
$str = '<a href="http://rads.stackoverflow.com/amzn/click/B008EYEYBA">Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035</a><img src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />
';
$chars = preg_split('/ /', $str, -1, PREG_SPLIT_OFFSET_CAPTURE);
echo '<pre>';
print_r($chars);
echo '</pre>';
?>
This gives the following result (see element 1 of the array):
Array
(
[0] => Array
(
[0] => <a
[1] => 0
)
[1] => Array
(
[0] => href="http://rads.stackoverflow.com/amzn/click/B008EYEYBA">Nike
[1] => 3
)
[2] => Array
(
[0] => Air
[1] => 167
)
[3] => Array
(
[0] => Jordan
[1] => 171
)
[4] => Array
(
[0] => SC-2
[1] => 178
)
[5] => Array
(
[0] => Mens
[1] => 183
)
[6] => Array
(
[0] => Basketball
[1] => 188
)
[7] => Array
(
[0] => Shoes
[1] => 199
)
[8] => Array
(
[0] => 454050-035</a><img
[1] => 205
)
[9] => Array
(
[0] => src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA"
[1] => 224
)
[10] => Array
(
[0] => width="1"
[1] => 305
)
[11] => Array
(
[0] => height="1"
[1] => 315
)
[12] => Array
(
[0] => border="0"
[1] => 326
)
[13] => Array
(
[0] => alt=""
[1] => 337
)
[14] => Array
(
[0] => style="border:none
[1] => 344
)
[15] => Array
(
[0] => !important;
[1] => 363
)
[16] => Array
(
[0] => margin:0px
[1] => 375
)
[17] => Array
(
[0] => !important;"
[1] => 386
)
[18] => Array
(
[0] => />
[1] => 399
)
)
Notice that the word "Nike" is included in this element, when all I need is the URL:
[1] => Array
(
[0] => href="http://rads.stackoverflow.com/amzn/click/B008EYEYBA">Nike
[1] => 3
)
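My actual end goal in processing $str is simply to output the source URL and the anchor text into separate arrays, like this:
URL:
Anchor text:
Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035
Any ideas on how I can accomplish this would be greatly appreciated.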
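One possible way to get there with the preg_* family is to match the link as a whole instead of splitting on spaces. The following is only a minimal sketch: it assumes every link looks like the sample (a double-quoted href and no nested tags inside the anchor), and the variable names $pattern, $urls and $anchors are made up for illustration. Regular expressions are fragile on arbitrary HTML, so treat this as a fit for simple, predictable markup only.

```php
<?php
// Same sample markup as in the question.
$str = '<a href="http://rads.stackoverflow.com/amzn/click/B008EYEYBA">Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035</a><img src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />';

// Group 1 captures the double-quoted href value,
// group 2 captures everything up to the closing </a>.
// Assumption: hrefs are double-quoted and anchors contain plain text.
$pattern = '/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is';
preg_match_all($pattern, $str, $matches);

$urls    = $matches[1]; // one entry per link found
$anchors = $matches[2]; // anchor text at the same index

echo '<pre>';
print_r($urls);
print_r($anchors);
echo '</pre>';
```

Because preg_match_all groups its results by capture group by default, $matches[1] and $matches[2] are already the two separate arrays asked for, aligned by index.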
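I tried it, and it gave me a warning: DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: expecting ';' in Entity, line: 1 in – anagnam
That is because only a fragment of a whole document is being supplied. You can silence the warning or supply a complete document. –
Added libxml warning suppression... –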
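For reference, the DOMDocument approach with libxml warning suppression discussed in these comments might look roughly like the sketch below. It is not the original answer's code, and the variable names ($previous, $dom, $urls, $anchors) are assumptions. The htmlParseEntityRef message is most likely triggered by the bare & characters in the img src query string, which the parser tries to read as entity references; libxml_use_internal_errors(true) keeps such parse errors from being emitted as PHP warnings.

```php
<?php
// Same sample markup as in the question.
$str = '<a href="http://rads.stackoverflow.com/amzn/click/B008EYEYBA">Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035</a><img src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />';

// Collect libxml parse errors internally instead of raising PHP warnings,
// remembering the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($str); // the fragment is wrapped in <html><body> by the parser

libxml_clear_errors();                 // discard the collected errors
libxml_use_internal_errors($previous); // restore the earlier behaviour

$urls    = [];
$anchors = [];
foreach ($dom->getElementsByTagName('a') as $link) {
    $urls[]    = $link->getAttribute('href');
    $anchors[] = trim($link->textContent);
}

echo '<pre>';
print_r($urls);
print_r($anchors);
echo '</pre>';
```

If the parse errors are of interest rather than noise, libxml_get_errors() can be called before libxml_clear_errors() to inspect them.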