Say I am using the following in R. What is the right XPath expression to use with the XML package and xpathSApply?
library(XML)
url.df_1 = htmlTreeParse("http://www.appannie.com/app/android/com.king.candycrushsaga/", useInternalNodes = T)
After parsing the website, if I run the following code,
xpathSApply(url.df_1, "//div[@class='app_content_section']/h3", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
I get the following:
[1] "Description" "What's new"
[3] "Permissions" "More Apps by King.com All Apps »"
[5] "Customers Also Viewed" "Customers Also Installed"
Now, I am only interested in the "Customers Also Installed" section. But when I run the code below,
xpathSApply(url.df_1, "//div[@class='app_content_section']/ul/li/a", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
it returns every app listed under "More Apps by King.com", "Customers Also Viewed", and "Customers Also Installed".
So I tried,
xpathSApply(url.df_1, "//div[h3='Customers Also Installed']", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
But that did not work. So I tried
xpathSApply(url.df_1, "//div[contains(.,'Customers Also Installed')]",xmlValue)
But that does not work either. The output should look like this:
[,1]
[1,] "Christmas Candy Free\n Daniel Development\n "
[2,] "/app/android/xmas.candy.free/"
[,2]
[1,] "Jewel Candy Maker\n Nutty Apps\n "
[2,] "/app/android/com.candy.maker.jewel.nuttyapps/"
[,3]
[1,] "Pogz 2\n Terry Paton\n "
[2,] "/app/android/com.terrypaton.unity.pogz2/"
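For reference, here is a minimal sketch of one way the selection could be narrowed to a single section. It runs against a small inline stand-in document, not the live page: the markup in `html` is hypothetical and only assumes that each section is a `div` with class `app_content_section` containing an `h3` and a `ul` of links, as the expressions above suggest. It also assumes that the exact-match test `h3='...'` may have failed because of extra whitespace in the heading, which is a guess; `contains()` is used to tolerate that.

```r
library(XML)

# Hypothetical stand-in for the page markup; the real page structure may differ.
html <- '
<html><body>
<div class="app_content_section">
  <h3>Customers Also Viewed</h3>
  <ul><li><a href="/app/android/viewed.example/">Viewed App</a></li></ul>
</div>
<div class="app_content_section">
  <h3> Customers Also Installed </h3>
  <ul>
    <li><a href="/app/android/xmas.candy.free/">Christmas Candy Free</a></li>
    <li><a href="/app/android/com.terrypaton.unity.pogz2/">Pogz 2</a></li>
  </ul>
</div>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# Keep only the section div whose h3 mentions the wanted heading,
# then descend to the links inside that div.
path <- paste0(
  "//div[@class='app_content_section']",
  "[h3[contains(., 'Customers Also Installed')]]",
  "/ul/li/a"
)

# For each matched <a>, return its text and its href attribute.
res <- xpathSApply(doc, path, function(x) c(xmlValue(x), xmlGetAttr(x, "href")))
print(res)
```

The predicate is placed on the `div` and the path then continues to `ul/li/a`, so the callback receives link nodes (which carry `href`) instead of the enclosing `div`.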
Any guidance would be much appreciated!
+1! Great question. It is reproducible, and you showed what you have tried so far. – agstudy 2013-04-04 08:47:40