2016-06-08 73 views
1

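BS4 - grabbing information from something you have already parsed

Hey, I have done this before, but now I am having trouble applying the same thing to an almost identical page: page = "http://www.imdb.com/genre/action/?ref_=gnr_mn_ac_mp"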

table = soup.find_all("table", {"class": "results"})
for item in list(table):
    for info in item.contents[1::2]:
        info.a.extract()
        link = info.a['href']
        print(link)
        name = info.text.strip()
        print(name)

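To explain: the code above tries to capture, from the tag held in the variable info, the link to each movie's page, and the text inside it holds each movie's name, but I get all of the text. Is there any way to get just the name?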

Thanks in advance, guys!

Answers

1

You just need to pull the text from the a tag inside the td with the class title:

In [15]: from bs4 import BeautifulSoup 
In [16]: import requests 

In [17]: url = "http://www.imdb.com/genre/action/?ref_=gnr_mn_ac_mp" 

In [18]: soup = BeautifulSoup(requests.get(url).content, "lxml")

In [19]: for td in soup.select("table.results td.title"): 
    ....:   print(td.a.text) 
    ....:  
X-Men: Apocalypse 
Warcraft 
Captain America: Civil War 
The Do-Over 
Teenage Mutant Ninja Turtles: Out of the Shadows 
The Angry Birds Movie 
The Nice Guys 
Batman v Superman: Dawn of Justice 
Suicide Squad 
Deadpool 
Gods of Egypt 
Zootopia 
13 Hours: The Secret Soldiers of Benghazi 
Now You See Me 2 
The Brothers Grimsby 
Hardcore Henry 
Monster Trucks 
Independence Day: Resurgence 
Star Trek Beyond 
The Legend of Tarzan 
Deepwater Horizon 
X-Men: Days of Future Past 
Star Wars: The Force Awakens 
X-Men: First Class 
The 5th Wave 
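
For anyone who cannot reach the live page, the same selector can be tried offline. The sketch below runs it against a small hand-written HTML fragment. The markup is an assumption made up for illustration (only the `results` and `title` class names come from the answer), so the real IMDb page may well differ. It also shows why the whole-cell text contains more than the name, while `td.a.text` returns only the anchor text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the structure the answer describes.
SAMPLE_HTML = """
<table class="results">
  <tr>
    <td class="image"><a href="/title/tt0000001/"><img alt="poster"></a></td>
    <td class="title">
      <a href="/title/tt0000001/">First Movie</a>
      <span class="outline">A short outline.</span>
      <span class="runtime">100 mins.</span>
    </td>
  </tr>
  <tr>
    <td class="image"><a href="/title/tt0000002/"><img alt="poster"></a></td>
    <td class="title">
      <a href="/title/tt0000002/">Second Movie</a>
    </td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

movies = []
for td in soup.select("table.results td.title"):
    # td.a is the first <a> inside the cell; .text is only that anchor's text.
    movies.append((td.a.text.strip(), td.a["href"]))

for name, link in movies:
    print(name, link)

# The text of the whole cell also includes the outline and the runtime.
first_cell = soup.select_one("table.results td.title")
whole_cell_text = first_cell.get_text(" ", strip=True)
print(whole_cell_text)
```

Nothing is removed from the tree with `extract()` here, so the link and the name are both read from the same anchor.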

Pretty much all of the data you want is inside the td with the title class.

So if you also want the outline, you need the text from span.outline:

In [24]: for td in soup.select("table.results td.title"): 
    ....:   print(td.a.text) 
    ....:   print(td.select_one("span.outline").text) 
    ....:  
X-Men: Apocalypse 
With the emergence of the world's first mutant, Apocalypse, the X-Men must unite to defeat his extinction level plan. 
Warcraft 
The peaceful realm of Azeroth stands on the brink of war as its civilization faces a fearsome race of... 
Captain America: Civil War 
Political interference in the Avengers' activities causes a rift between former allies Captain America and Iron Man. 
The Do-Over 
Two down-on-their-luck guys decide to fake their own deaths and start over with new identities, only to find the people they're pretending to be are in even deeper trouble. 
Teenage Mutant Ninja Turtles: Out of the Shadows 
As Shredder joins forces with mad scientist Baxter Stockman and henchmen Bebop and Rocksteady to take over the world, the Turtles must confront an even greater nemesis: the notorious Krang. 
The Angry Birds Movie 
Find out why the birds are so angry. When an island populated by happy, flightless birds is visited by mysterious green piggies, it's up to three unlikely outcasts - Red, Chuck and Bomb - to figure out what the pigs are up to. 
The Nice Guys 
A mismatched pair of private eyes investigate the apparent suicide of a fading porn star in 1970s Los Angeles. 
Batman v Superman: Dawn of Justice 
Fearing that the actions of Superman are left unchecked, Batman takes on the Man of Steel, while the world wrestles with what kind of a hero it really needs. 
Suicide Squad 
A secret government agency recruits imprisoned supervillains to execute dangerous black ops missions in exchange for clemency. 
Deadpool 
A former Special Forces operative turned mercenary is subjected to a rogue experiment that leaves him with accelerated healing powers, adopting the alter ego Deadpool. 
Gods of Egypt 
Mortal hero Bek teams with the god Horus in an alliance against Set, the merciless god of darkness, who has usurped Egypt's throne, plunging the once peaceful and prosperous empire into chaos and conflict. 
Zootopia 
In a city of anthropomorphic animals, a rookie bunny cop and a cynical con artist fox must work together to uncover a conspiracy. 
13 Hours: The Secret Soldiers of Benghazi 
During an attack on a U.S. compound in Libya, a security team struggles to make sense out of the chaos. 
Now You See Me 2 
The Four Horsemen resurface and are forcibly recruited by a tech genius to pull off their most impossible heist yet. 
The Brothers Grimsby 
A new assignment forces a top spy to team up with his football hooligan brother. 
Hardcore Henry 
Henry is resurrected from death with no memory, and he must save his wife from a telekinetic warlord with a plan to bio-engineer soldiers. 
Monster Trucks 
Looking for any way to get away from the life and town he was born into, Tripp (Lucas Till), a high school senior... 
Independence Day: Resurgence 
Two decades after the first Independence Day invasion, Earth is faced with a new extra-Solar threat. But will mankind's new space defenses be enough? 
Star Trek Beyond 
The USS Enterprise crew explores the furthest reaches of uncharted space, where they encounter a mysterious new enemy who puts them and everything the Federation stands for to the test. 
The Legend of Tarzan 
Tarzan, having acclimated to life in London, is called back to his former home in the jungle to investigate the activities at a mining encampment. 
Deepwater Horizon 
A story set on the offshore drilling rig Deepwater Horizon, which exploded during April 2010 and created the worst oil spill in U.S. history. 
X-Men: Days of Future Past 
The X-Men send Wolverine to the past in a desperate effort to change history and prevent an event that results in doom for both humans and mutants. 
Star Wars: The Force Awakens 
Three decades after the defeat of the Galactic Empire, a new threat arises. The First Order attempts to rule the galaxy and only a ragtag group of heroes can stop them, along with the help of the Resistance. 
X-Men: First Class 
In 1962, the United States government enlists the help of Mutants with superhuman abilities to stop a malicious dictator who is determined to start World War III. 
The 5th Wave 
Four waves of increasingly deadly alien attacks have left most of Earth decimated. Cassie is on the run, desperately trying to save her younger brother. 

The runtime is td.select_one("span.runtime").text, and so on.
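
One thing to watch for when pulling several fields this way: `select_one` returns `None` when nothing matches, so `.text` on a missing span raises an `AttributeError`. Here is a minimal sketch of a guarded version. The helper name `text_or_none` and the sample markup are hypothetical, made up for illustration; only the `span.outline` and `span.runtime` selectors come from the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second cell has no outline or runtime span.
SAMPLE_HTML = """
<table class="results">
  <tr><td class="title">
    <a href="/title/tt0000001/">First Movie</a>
    <span class="outline">A short outline.</span>
    <span class="runtime">100 mins.</span>
  </td></tr>
  <tr><td class="title">
    <a href="/title/tt0000002/">Second Movie</a>
  </td></tr>
</table>
"""


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if there is none."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
for td in soup.select("table.results td.title"):
    rows.append({
        "title": td.a.get_text(strip=True),
        "outline": text_or_none(td, "span.outline"),
        "runtime": text_or_none(td, "span.runtime"),
    })

for row in rows:
    print(row)
```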

+0

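Oh my god, I've been putting () after td.a.text and it kept telling me it has no attribute text, I've been racking my brain haha. How come there are no brackets now? Also, thank you very much – entercaspa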

+0

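No worries. All of the anchors are contained in the tds with the title class in the results table. 'table.results td.title' uses a CSS selector to pull only those tds, and then we just access the anchor with '.a'. The movie title is the anchor's text –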
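
On the brackets question: in BeautifulSoup, `.text` is a property, so it is read without parentheses, while `get_text()` is the method form. The comment does not show the exact code that failed, so the sketch below is only a guess at the two likely failure modes. The markup is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical one-cell table, just enough to have a td with one anchor.
html = '<table><tr><td class="title"><a href="/title/tt0000001/">First Movie</a></td></tr></table>'
soup = BeautifulSoup(html, "html.parser")
td = soup.select_one("td.title")

via_property = td.a.text        # property: no parentheses
via_method = td.a.get_text()    # method: parentheses required

# Calling the property's result fails, because it is already a str.
try:
    td.a.text()
    call_error = None
except TypeError as exc:
    call_error = exc

# "has no attribute 'text'" is what appears when the tag lookup itself found
# nothing: td.span is None here because the cell contains no <span>.
try:
    td.span.text
    missing_error = None
except AttributeError as exc:
    missing_error = exc

print(via_property, via_method)
print(type(call_error).__name__, type(missing_error).__name__)
```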

-1

Just like how you do

info.a['href']

to get the link, you can also get the movie's title by doing

info.a['title']

Hope this is what you were looking for!
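
Whether the anchors on that page actually carry a `title` attribute is not confirmed anywhere in this thread, so treat it as an assumption. Subscripting a tag with a missing attribute raises `KeyError`, whereas `.get()` returns `None`, which allows a fallback to the anchor text. A minimal sketch on hypothetical markup:

```python
from bs4 import BeautifulSoup

# Hypothetical anchor without a title attribute.
html = '<td class="title"><a href="/title/tt0000001/">First Movie</a></td>'
anchor = BeautifulSoup(html, "html.parser").a

link = anchor["href"]             # present, so subscripting works
title_attr = anchor.get("title")  # missing, so .get() returns None

# Fall back to the anchor text when there is no title attribute.
name = title_attr or anchor.get_text(strip=True)

# Subscripting the missing attribute raises KeyError instead.
try:
    anchor["title"]
    key_error = None
except KeyError as exc:
    key_error = exc

print(link, name)
```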
