First off, I wish I could be of more help to you. Finally, a question that isn't $SPORTSBALL or $MONEY related! :-)
That site is evil. It uses embedded namespaces that have to be dealt with, which also means using the xml2 package:
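library(rvest)
library(xml2)
# Fetch and parse the datasheet page
isc <- read_html("http://www.cabi.org/isc/datasheet/50069")
# Collect the namespaces declared in the document so XPath can resolve them
ns <- xml_ns(isc)
# Text of the first cell in every row of the distribution table
xml_text(xml_find_all(isc, xpath="//div[@id='toDistributionTable']/table/tbody/tr/td[1]", ns=ns))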
## [1] "ASIA" "Azerbaijan"
## [3] "Bhutan" "China"
## [5] "-Tibet" "India"
## [7] "-Delhi" "-Indian Punjab"
## [9] "-Rajasthan" "-Uttar Pradesh"
## [11] "Iran" "Iraq"
## [13] "Israel" "Jordan"
## [15] "Kuwait" "Lebanon"
## [17] "Oman" "Pakistan"
## [19] "Qatar" "Saudi Arabia"
## [21] "Syria" "Turkey"
## [23] "Turkmenistan" "United Arab Emirates"
## [25] "Uzbekistan" "Yemen"
## [27] "AFRICA" "Algeria"
## [29] "Egypt" "Libya"
## [31] "Morocco" "South Africa"
## [33] "Tunisia" "NORTH AMERICA"
## [35] "Mexico" "USA"
## [37] "-Arizona" "-California"
## [39] "-Nevada" "-New Mexico"
## [41] "-Texas" "-Utah"
## [43] "SOUTH AMERICA" "Chile"
## [45] "EUROPE" "Belgium"
## [47] "Cyprus" "Denmark"
## [49] "France" "Greece"
## [51] "Ireland" "Italy"
## [53] "Spain" "Sweden"
## [55] "UK" "-England and Wales"
## [57] "-Scotland" "OCEANIA"
## [59] "Australia" "-Australian Northern Territory"
## [61] "-New South Wales" "-Queensland"
## [63] "-South Australia" "-Tasmania"
## [65] "-Victoria" "-Western Australia"
## [67] "New Zealand"
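To illustrate why the namespace map gets passed to xml_find_all(), here is a minimal sketch on a made-up XML snippet (not the CABI page; the namespace URI and element names are assumptions for the example). When a document declares a default namespace, xml2 gives it a generated prefix such as d1, and an XPath without that prefix will not match the namespaced elements:

```r
library(xml2)

# Hypothetical document with a default namespace (stand-in, not the real site)
doc <- read_xml('<root xmlns="http://example.com/ns"><item>a</item><item>b</item></root>')

# xml_ns() lists the declared namespaces; the default one is named d1
ns <- xml_ns(doc)

# An unprefixed XPath looks for elements in no namespace, so it finds nothing here
no_prefix <- xml_find_all(doc, "//item")

# Using the generated prefix together with the namespace map finds both items
with_prefix <- xml_text(xml_find_all(doc, "//d1:item", ns))
```

Printing ns is usually the quickest way to see which prefixes are available before writing the XPath.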
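Awesome, thanks! This should give me a good start on getting data from that site. How did you get the information that goes into the xpath part of the xml_find_all function? –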
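I right-clicked on the table, chose Inspect Element, and mapped it out from the path shown in Developer Tools. I could probably redo it with CSS selectors, but knowing a little XPath helps in some cases. – hrbrmstr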
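For what it's worth, a CSS version might look roughly like the sketch below. It runs against a tiny hand-written HTML fragment that only mimics the div id and table nesting from the XPath above (the fragment and its second column are assumptions, and it has not been checked against the live page), and it uses rvest's html_nodes() with both selector styles so the two can be compared:

```r
library(rvest)

# Hypothetical stand-in for the page: same div id and nesting as the XPath expects
html <- '<html><body><div id="toDistributionTable"><table><tbody>
<tr><td>ASIA</td><td>x</td></tr>
<tr><td>Azerbaijan</td><td>y</td></tr>
</tbody></table></div></body></html>'
page <- read_html(html)

# XPath: first cell of every row in the table under the div
via_xpath <- html_text(html_nodes(page, xpath = "//div[@id='toDistributionTable']/table/tbody/tr/td[1]"))

# CSS: the same path, using child combinators and :first-child
via_css <- html_text(html_nodes(page, css = "div#toDistributionTable > table > tbody > tr > td:first-child"))
```

On this fragment both selectors pick out the same cells. On the real site the namespace handling may still call for the XPath route.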