2016-12-14 64 views

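Basic BeautifulSoup Wikipedia scrape

I'm trying to pull a very basic, short unordered list (<ul>) off Wikipedia. My end goal is to put it into a DataFrame. My question is: where do I go from here?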

In [28]: from bs4 import BeautifulSoup
         import requests
         from pandas import Series, DataFrame

In [29]: url = "https://en.wikipedia.org/wiki/National_Pro_Grid_League"

In [31]: result = requests.get(url)

In [32]: c = result.content

In [33]: soup = BeautifulSoup(c, "html.parser")

I couldn't find an answer to this on StackOverflow, so I'd appreciate any advice.
This is the specific list I'm looking for:

Active teams
Baltimore Anthem (2015–present) 
Boston Iron (2014–present) 
DC Brawlers (2014–present) 
Los Angeles Reign (2014–present) 
Miami Surge (2014–present) 
New York Rhinos (2014–present) 
Phoenix Rise (2014–present) 
San Francisco Fire (2014–present) 

Answers


First, you need to locate the right part of the page. You can do this by finding the heading with id="Active_teams" and then finding the next <ul> element after it.

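from bs4 import BeautifulSoup
import requests

url = "https://en.wikipedia.org/wiki/National_Pro_Grid_League"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")

# Find the "Active teams" heading, then the first <ul> that follows it
heading = soup.find(id='Active_teams')
teams = heading.find_next('ul')
# Each <li> holds a link plus trailing text, so use get_text() rather than .string
for team in teams.find_all('li'):
    print(team.get_text())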
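
Since the end goal is a DataFrame, here is a rough sketch of that last step. It runs against a small inline HTML sample instead of the live page. The sample markup, the `teams_to_frame` helper, and the `team` column name are all made up for illustration, and the real Wikipedia markup may differ.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the relevant part of the page, so this runs offline.
SAMPLE_HTML = """
<h2><span id="Active_teams">Active teams</span></h2>
<ul>
<li><a href="/wiki/Baltimore_Anthem">Baltimore Anthem</a> (2015–present)</li>
<li><a href="/wiki/Boston_Iron">Boston Iron</a> (2014–present)</li>
<li><a href="/wiki/DC_Brawlers">DC Brawlers</a> (2014–present)</li>
</ul>
"""


def teams_to_frame(html, heading_id="Active_teams"):
    """Return a one-column DataFrame built from the <ul> after the given heading."""
    soup = BeautifulSoup(html, "html.parser")

    heading = soup.find(id=heading_id)
    if heading is None:
        raise ValueError("no element with id=%r found" % heading_id)

    ul = heading.find_next("ul")
    if ul is None:
        raise ValueError("no <ul> found after the heading")

    # get_text() joins the link text and the trailing "(year–present)" text
    rows = [li.get_text().strip() for li in ul.find_all("li")]
    return pd.DataFrame({"team": rows})


df = teams_to_frame(SAMPLE_HTML)
print(df)
```

To try it on the real page, pass `requests.get(url).content` in place of `SAMPLE_HTML`.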

Thanks! This works. I'm sure I'll only have more questions in the future.