2014-09-23 69 views

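Python code produces no output when running BeautifulSoup

I recently tried to use BeautifulSoup with the Python code below, taken from this question, where it appears to work for the asker.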

import urllib2 
import bs4 
import string 
from bs4 import BeautifulSoup 

badwords = set([ 
    'cup','cups', 
    'clove','cloves', 
    'tsp','teaspoon','teaspoons', 
    'tbsp','tablespoon','tablespoons', 
    'minced' 
]) 

def cleanIngred(s): 

    s=s.strip() 

    s=s.strip(string.digits + string.punctuation) 

    return ' '.join(word for word in s.split() if not word in badwords) 

def cleanIngred(s): 
    # remove leading and trailing whitespace 
    s = s.strip() 
    # remove numbers and punctuation in the string 
    s = s.strip(string.digits + string.punctuation) 
    # remove unwanted words 
    return ' '.join(word for word in s.split() if not word in badwords) 

def main(): 
    url = "http://allrecipes.com/Recipe/Slow-Cooker-Pork-Chops-II/Detail.aspx" 
    data = urllib2.urlopen(url).read() 
    bs = BeautifulSoup.BeautifulSoup(data) 

    ingreds = bs.find('div', {'class': 'ingredients'}) 
    ingreds = [cleanIngred(s.getText()) for s in ingreds.findAll('li')] 

    fname = 'PorkRecipe.txt' 
    with open(fname, 'w') as outf: 
        outf.write('\n'.join(ingreds))

if __name__=="__main__": 
    main() 

For some reason, though, I can't get it to work in my case. I get this error:

AttributeError       Traceback (most recent call last) 
<ipython-input-4-55411b0c5016> in <module>() 
    41 
    42 if __name__=="__main__": 
---> 43  main() 

<ipython-input-4-55411b0c5016> in main() 
    31  url = "http://allrecipes.com/Recipe/Slow-Cooker-Pork-Chops-II/Detail.aspx" 
    32  data = urllib2.urlopen(url).read() 
---> 33  bs = BeautifulSoup.BeautifulSoup(data) 
    34 
    35  ingreds = bs.find('div', {'class': 'ingredients'}) 

AttributeError: type object 'BeautifulSoup' has no attribute 'BeautifulSoup' 

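I suspect this is because I'm using bs4 rather than BeautifulSoup. I tried replacing the line bs = BeautifulSoup.BeautifulSoup(data) with bs = bs4.BeautifulSoup(data). The error went away, but there is still no output. Are there too many possible causes to guess at?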


They 'import BeautifulSoup'; you 'from bs4 import BeautifulSoup'. You should use 'bs = BeautifulSoup(data)', or 'import bs4' and then 'bs = bs4.BeautifulSoup(data)'. – jonrsharpe 2014-09-23 15:34:02

Answers

1

The original code used BeautifulSoup version 3:

import BeautifulSoup 

You switched to BeautifulSoup version 4, but you also switched the import style:

from bs4 import BeautifulSoup 

Either remove that line, since you already have the correct import earlier in your file:

import bs4 

and then use:

bs = bs4.BeautifulSoup(data) 

or change the latter line to:

bs = BeautifulSoup(data) 

(and remove the import bs4 line).
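The same kind of AttributeError can be reproduced with the standard library alone. The following is a minimal Python 3 sketch that uses datetime as a stand-in for BeautifulSoup; it is an analogy, not the code from the question, and the variable names are made up for illustration.

```python
# Analogy only: the datetime module plays the role of the BeautifulSoup module.
# `from datetime import datetime` binds the name `datetime` to the class,
# just as `from bs4 import BeautifulSoup` binds `BeautifulSoup` to the class.
from datetime import datetime

try:
    # Equivalent mistake to BeautifulSoup.BeautifulSoup(data):
    # looking up a class attribute named like the class itself.
    datetime.datetime(2014, 9, 23)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# Both consistent styles work: module.Class(...) or the bare Class(...).
import datetime as datetime_module

via_module = datetime_module.datetime(2014, 9, 23)
via_class = datetime(2014, 9, 23)
print(via_module == via_class)
```

The point is that the name on the left of the dot must match what the import statement actually bound: a module in one style, the class itself in the other.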

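You may also want to review the Porting code to BS4 section of the BeautifulSoup documentation, so that you can spot any other changes needed to upgrade the code and get the most out of BeautifulSoup version 4.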

The script otherwise works fine and produces a new file, PorkRecipe.txt. It does not produce any output on stdout.
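To illustrate why nothing shows up on the console, here is a small Python 3 sketch that skips the network request and runs the same cleaning logic on a few hypothetical ingredient strings. The sample strings, the temporary directory, and the name clean_ingred are assumptions made for this illustration; only the cleaning rules and the file name come from the question.

```python
import os
import string
import tempfile

badwords = {
    'cup', 'cups',
    'clove', 'cloves',
    'tsp', 'teaspoon', 'teaspoons',
    'tbsp', 'tablespoon', 'tablespoons',
    'minced',
}


def clean_ingred(s):
    # Same steps as cleanIngred in the question.
    s = s.strip()
    # strip() only removes digits and punctuation at the two ends.
    s = s.strip(string.digits + string.punctuation)
    return ' '.join(word for word in s.split() if word not in badwords)


# Hypothetical sample data standing in for the scraped <li> texts.
samples = ["1/4 cup olive oil", "2 cloves garlic, minced", "1 tablespoon paprika"]
ingreds = [clean_ingred(s) for s in samples]

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    fname = os.path.join(tmpdir, 'PorkRecipe.txt')
    with open(fname, 'w') as outf:
        # Writing to a file prints nothing to the console.
        outf.write('\n'.join(ingreds))

    # To see the result, read the file back (or print the list directly).
    with open(fname) as inf:
        written = inf.read()

print(written)
```

Opening the generated file, or adding a print call as in the last line, is enough to confirm that the script did its work.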

After fixing the bs4.BeautifulSoup reference, the file contains:

READY IN 4+ hrs 

Slow Cooker Pork Chops II 

Amazing Pork Tenderloin in the Slow Cooker 

Jerre's Black Bean and Pork Slow Cooker Chili 

Slow Cooker Pulled Pork 

Slow Cooker Sauerkraut Pork Loin 

Slow Cooker Texas Pulled Pork 

Oven-Fried Pork Chops 

Pork Chops for the Slow Cooker 

Tangy Slow Cooker Pork Roast 

Types of Cooking Oil 

Garlic: Fresh Vs. Powdered 

All about Paprika 

Types of Salt 
olive oil 
chicken broth 
garlic, 
paprika 
garlic powder 
poultry seasoning 
dried oregano 
dried basil 
thick cut boneless pork chops 
salt and pepper to taste 
PREP 10 mins 
COOK 4 hrs 
READY IN 4 hrs 10 mins 
In a large bowl, whisk together the olive oil, chicken broth, garlic, paprika, garlic powder, poultry seasoning, oregano, and basil. Pour into the slow cooker. Cut small slits in each pork chop with the tip of a knife, and season lightly with salt and pepper. Place pork chops into the slow cooker, cover, and cook on High for 4 hours. Baste periodically with the sauce 

@MaxPower: The script otherwise works; you need to check whether the file was generated, not whether there is output on the console. – 2014-09-23 15:43:20


@Martjin Thank you very much. I should be more careful when using different versions! – 2014-09-23 15:47:34