Web scraping with Python on Canopy

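I'm having trouble with this code, which is meant to list the stock prices of the 4 companies in the list. It runs without errors, but it only prints empty brackets where the stock price should go. That is the source of my confusion.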

import urllib2 
import re 

symbolslist = ["aapl","spy","goog","nflx"] 
i = 0 

while i<len(symbolslist): 
    url = "http://money.cnn.com/quote/quote.html?symb=' +symbolslist[i] + '" 
    htmlfile = urllib2.urlopen(url) 
    htmltext = htmlfile.read() 
    regex = '<span stream='+symbolslist[i]+' streamformat="ToHundredth" streamfeed="SunGard">(.+?)</span>' 
    pattern = re.compile(regex) 
    price = re.findall(pattern,htmltext) 
    print "the price of", symbolslist[i], " is ", price 
    i+=1


2016-09-16 Kainesplain

Because you are not passing the variable:

url = "http://money.cnn.com/quote/quote.html?symb=' +symbolslist[i] + '" 
                 ^^^^^ 
                 a string not the list element

Use str.format:

url = "http://money.cnn.com/quote/quote.html?symb={}".format(symbolslist[i])
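
The difference between the two URL strings can be checked without any network access. A minimal sketch, assuming Python 3 (the code in this thread is Python 2); the names buggy_url and fixed_url are made up for illustration:

```python
symbolslist = ["aapl", "spy", "goog", "nflx"]
i = 0

# Line from the question: the single quotes sit inside a double-quoted literal,
# so they are ordinary characters and nothing is concatenated.
buggy_url = "http://money.cnn.com/quote/quote.html?symb=' +symbolslist[i] + '"

# Fix from the answer: str.format substitutes the list element into the URL.
fixed_url = "http://money.cnn.com/quote/quote.html?symb={}".format(symbolslist[i])

print(buggy_url)
print(fixed_url)
```

The first URL is identical on every pass through the loop, because the symbol never makes it into the string.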

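You can also iterate directly over the list, with no need for a while loop. Never parse HTML with a regex; use an HTML parser such as bs4. Your regex is also wrong: there is no stream="aapl" and so on. What you want is the span where streamformat="ToHundredth" and streamfeed="SunGard":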

import urllib2 
from bs4 import BeautifulSoup 

symbolslist = ["aapl","spy","goog","nflx"] 


for symbol in symbolslist: 
    url = "http://money.cnn.com/quote/quote.html?symb={}".format(symbol) 
    htmlfile = urllib2.urlopen(url) 
    soup = BeautifulSoup(htmlfile.read(), "html.parser") 
    price = soup.find("span",streamformat="ToHundredth", streamfeed="SunGard").text 
    print "the price of {} is {}".format(symbol,price)

You can see that it works if we run the code:

In [1]: import urllib2 

In [2]: from bs4 import BeautifulSoup 

In [3]: symbols_list = ["aapl", "spy", "goog", "nflx"] 

In [4]: for symbol in symbols_list: 
    ...:   url = "http://money.cnn.com/quote/quote.html?symb={}".format(symbol) 
    ...:   htmlfile = urllib2.urlopen(url) 
    ...:   soup = BeautifulSoup(htmlfile.read(), "html.parser") 
    ...:   price = soup.find("span",streamformat="ToHundredth", streamfeed="SunGard").text 
    ...:   print "the price of {} is {}".format(symbol,price) 
    ...:  
the price of aapl is 115.57 
the price of spy is 215.28 
the price of goog is 771.76 
the price of nflx is 97.34

We get what you want.
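
As for the empty brackets in the question: re.findall returns an empty list when the pattern matches nothing, and printing that list shows []. A minimal sketch, assuming Python 3 and a hypothetical one-line sample of the markup described above (a span with streamformat and streamfeed but no stream attribute):

```python
import re

symbol = "aapl"

# Hypothetical sample of the markup: no stream attribute on the span.
sample_html = '<span streamformat="ToHundredth" streamfeed="SunGard">115.57</span>'

# The pattern from the question, built for one symbol.
regex = ('<span stream=' + symbol +
         ' streamformat="ToHundredth" streamfeed="SunGard">(.+?)</span>')
prices = re.findall(regex, sample_html)

# prices is an empty list because the pattern requires "stream=aapl".
print("the price of", symbol, "is", prices)  # the price of aapl is []
```

So even with the URL fixed, the original pattern would most likely still find nothing on a page shaped like this.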


2016-09-16 11:07:53

Now I get this error after trying your code. It flags line 11 and says: AttributeError: 'NoneType' object has no attribute 'text' – Kainesplain

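The code's output is included in the answer. If you get different output using the symbols from the answer, then either the code is being used incorrectly or, for some reason, you are not getting the correct page source. Just telling me you get an attribute error, without any more context, makes it a bit hard to debug –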
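
The AttributeError in the comment above is what happens when the lookup finds no matching span: BeautifulSoup's find returns None in that case, and None has no .text. A minimal sketch of guarding against that, assuming Python 3 and using the standard library's html.parser in place of bs4 so it runs without extra installs. PriceSpanParser, extract_price and both sample pages are hypothetical, made up for illustration:

```python
from html.parser import HTMLParser


class PriceSpanParser(HTMLParser):
    """Collects the text of the first <span> carrying both target attributes."""

    def __init__(self):
        super().__init__()
        self.in_target = False
        self.price = None  # stays None when no matching span is seen

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "span" and self.price is None
                and attrs.get("streamformat") == "ToHundredth"
                and attrs.get("streamfeed") == "SunGard"):
            self.in_target = True

    def handle_data(self, data):
        # Record the first text chunk inside the matching span.
        if self.in_target and self.price is None:
            self.price = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_target = False


def extract_price(html):
    """Return the price text, or None if the page has no matching span."""
    parser = PriceSpanParser()
    parser.feed(html)
    return parser.price


# Hypothetical pages: one with the expected span, one without it.
good_page = '<div><span streamformat="ToHundredth" streamfeed="SunGard">115.57</span></div>'
bad_page = '<div><span class="quote">n/a</span></div>'

for label, page in [("good_page", good_page), ("bad_page", bad_page)]:
    price = extract_price(page)
    if price is None:
        # Report the missing span instead of failing on a None value.
        print("no matching span in {}; inspect the downloaded HTML".format(label))
    else:
        print("the price in {} is {}".format(label, price))
```

With bs4, the same check would go on the result of soup.find before touching .text. Printing part of the downloaded HTML when the span is missing is one way to see whether the page source differs from what the answer assumed.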
