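List index out of range when parsing a website with .lower()

I am parsing a website to count the number of lines that mention a keyword. Everything runs fine with the code below: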
import time
import urllib2
from urllib2 import urlopen
import datetime

website = 'http://www.dailyfinance.com/2014/11/13/market-wrap-seventh-dow-record-in-eight-days/#!slide=3077515'
topSplit = 'NEW YORK -- '
bottomSplit = "<div class=\"knot-gallery\""

# Count mentions on newlines
def main():
    try:
        x = 0
        sourceCode = urllib2.urlopen(website).read()
        # Keep only the text between the two markers
        sourceSplit = sourceCode.split(topSplit)[1].split(bottomSplit)[0]
        content = sourceSplit.split('\n')  # provides an array
        for line in content:
            if 'gain' in line:
                x += 1
        print x
    except Exception as e:
        print 'Failed in the main loop'
        print str(e)

main()
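However, I want to count every mention of the keyword regardless of case (in this case 'gain' or 'Gain'). To do that, I added .lower() to the line that reads the source code: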
sourceCode = urllib2.urlopen(website).read().lower()
However, this gives me the following error:
Failed in the main loop
list index out of range
I assume .lower() is throwing off the index, but why does this happen?
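A likely explanation, sketched under assumptions: once the whole page is lower-cased, the upper-case marker 'NEW YORK -- ' no longer occurs in it, so split() returns a one-element list and indexing [1] raises the IndexError. The snippet below reproduces this on a short made-up string (sample_page is a hypothetical stand-in for the real HTML, and count_gain_lines is an illustrative helper, not part of the original code). It avoids the network and runs on Python 2 or 3.

```python
# Hypothetical stand-in for the downloaded page; the real HTML is not shown here.
sample_page = ('intro NEW YORK -- Stocks gain.\n'
               'More Gain today.\n'
               '<div class="knot-gallery">')

top_split = 'NEW YORK -- '
bottom_split = '<div class="knot-gallery"'


def count_gain_lines(page, top, bottom):
    """Count lines between the two markers that contain 'gain'."""
    body = page.split(top)[1].split(bottom)[0]
    return sum(1 for line in body.split('\n') if 'gain' in line)


lowered = sample_page.lower()

# The upper-case marker is gone from the lowered text, so split() finds
# nothing to split on and returns a list with a single element.
parts = lowered.split(top_split)
print(len(parts))

# Lower-casing the markers as well keeps them findable in the lowered text.
print(count_gain_lines(lowered, top_split.lower(), bottom_split.lower()))
```

In this sketch the markers are lower-cased at the call site, so the module-level constants stay readable in their original form.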
Great answer. Following your suggestion, I used 'topSplit = 'NEW YORK - '.lower()' and got it running. I will also take a look at the 're' module; thanks for your help. – Chuck
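For the 're' route mentioned in the comment, a minimal sketch might look like the following. It assumes the goal is still to count lines that contain the keyword in any case; sample_text is a made-up example, not content from the page. Because the match itself is case-insensitive, the page text and the split markers can stay in their original case.

```python
import re

# Hypothetical sample text standing in for the extracted article body.
sample_text = 'Stocks gain.\nMore Gain today.\nNo change here.'

# re.IGNORECASE makes the pattern match 'gain', 'Gain', 'GAIN', and so on.
pattern = re.compile(r'gain', re.IGNORECASE)

# Keep only the lines where the keyword appears at least once.
matching_lines = [line for line in sample_text.split('\n')
                  if pattern.search(line)]
print(len(matching_lines))
```

Note that a bare 'gain' pattern also matches inside longer words such as 'against'; a pattern like r'\bgain' would be one way to tighten that, if whole-word matches are wanted.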