2013-12-09 30 views
0

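Unexpected end of regular expression with Python

I am trying to scrape stock prices from Yahoo, following Chris Reeves' tutorial, and store them in a local database. Whenever I run this code I keep getting the error in the title. Can anyone tell me what is wrong here? Thanks.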

from threading import Thread 
import urllib 
import re 
import MySQLdb 

gmap = {} 

def th(ur): 
    base = "http://finance.yahoo.com/q?s="+ur 
    regex = '<span id="yfs_l84_'+ur.lower()+'">(.+?)</span>' 
    pattern = re.compile(regex) 
    htmltext = urllib.urlopen(base).read() 
    results = re.findall(pattern, htmltext) 
    try:
        gmap[ur] = results[0]
    except:
        print "Got an error"

symbolslist = open("multithread/stocks.txt").read() 
symbolslist = symbolslist.replace(" ","").split(",") 

print symbolslist 

threadlist = [] 

for u in symbolslist: 
    t = Thread(target=th,args=(u,)) 
    t.start() 
    threadlist.append(t) 

for b in threadlist: 
    b.join() 

This is the exact error I get:

Exception in thread Thread-1: 
Traceback (most recent call last): 
    File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner 
    self.run() 
    File "C:\Python27\lib\threading.py", line 763, in run 
    self.__target(*self.__args, **self.__kwargs) 
    File "multithread/threads.py", line 11, in th 
    pattern = re.compile(regex) 
    File "C:\Python27\lib\re.py", line 190, in compile 
    return _compile(pattern, flags) 
    File "C:\Python27\lib\re.py", line 242, in _compile 
    raise error, v # invalid expression 
error: unexpected end of regular expression 
+0

I assume the error occurs at regex = '<span id="yfs_l84_'+ur.lower()+'">(.+?)</span>'? – ApproachingDarknessFish

+2

Sigh. The usual warning about parsing HTML with regular expressions applies. –

+1

You need to share the complete regular expression as it is constructed, or at least the value of ur.lower(). –

Answer

0

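Alas, you didn't tell us the important part: the output of print symbolslist. Something in that list creates an invalid regular expression when you paste it into the <span ... boilerplate.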
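One way to find the offending entry is to build the pattern for each symbol separately and report the ones that fail to compile. This is only a sketch: the helper name find_bad_symbols and the sample list are made up for illustration, and "a[b" merely stands in for whatever junk is actually in stocks.txt. The print call is written so that it behaves the same on Python 2 and Python 3.

```python
import re

def find_bad_symbols(symbols):
    """Return (symbol, error message) pairs for symbols that break the pattern."""
    bad = []
    for sym in symbols:
        # Same template as in the question, with the symbol pasted in unescaped.
        regex = '<span id="yfs_l84_' + sym.lower() + '">(.+?)</span>'
        try:
            re.compile(regex)
        except re.error as exc:
            bad.append((sym, str(exc)))
    return bad

# Hypothetical list contents; replace with the real symbolslist.
symbols = ["AAPL", "GOOG", "a[b"]
for sym, msg in find_bad_symbols(symbols):
    # repr() makes stray whitespace and newlines visible.
    print("%r -> %s" % (sym, msg))
```

The exact wording of the error message depends on the Python version, so it may not match the traceback in the question word for word.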

You can probably work around this by changing that line like so:

regex = '<span id="yfs_l84_' + re.escape(ur.lower()) + '">(.+?)</span>' 
                               ^^^^^^^^^^          ^

However, if that works, it is probably only hiding the real problem. The real problem is most likely that you have some junk in symbolslist.
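To deal with the junk itself rather than hide it, the list could be cleaned before any thread is started. The sketch below is an assumption about what "clean" means here: the helper name clean_symbols and the whitelist of characters (letters, digits, and ".", "^", "=", "-") are guesses, not something taken from the tutorial. Note that the question's code only removes spaces, so a trailing newline or an empty entry from the file would survive the split.

```python
import re

# Assumed whitelist of characters allowed in a ticker symbol.
VALID_SYMBOL = re.compile(r"[A-Za-z0-9.^=-]+\Z")

def clean_symbols(raw_text):
    """Split comma-separated text into (good symbols, rejected entries)."""
    good, junk = [], []
    for piece in raw_text.split(","):
        sym = piece.strip()  # removes spaces, tabs and newlines at both ends
        if not sym:
            continue  # skip empty entries such as "A,,B" or a trailing comma
        if VALID_SYMBOL.match(sym):
            good.append(sym)
        else:
            junk.append(sym)
    return good, junk

# Hypothetical file contents standing in for multithread/stocks.txt.
good, junk = clean_symbols("AAPL, GOOG,\nMSFT,, a[b\n")
print("good: %r" % (good,))
print("junk: %r" % (junk,))
```

Anything that ends up in junk is worth looking at by hand. With a clean list, re.escape becomes a safety net rather than a cover for bad input.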