2013-06-29 90 views
1

How to add a variable to URL parameters in urllib

I am trying to access this URL:

http://ichart.finance.yahoo.com/table.csv?s=GOOG&a=05&b=20&c=2013&d=05&e=28&f=2013&g=d&ignore=.csv

But instead of always being GOOG, the symbol should be whatever is entered in the variable tickers_list.

When I do this, it works:

import urllib.request

URL = urllib.request.urlopen("http://ichart.finance.yahoo.com/table.csv?s=GOOG&a=05&b=20&c=2013&d=05&e=28&f=2013&g=d&ignore=.csv")
html = URL.read() 
print (html) 

But if I do this:

filename = input("Please enter file name to extract data from: ") 
with open(filename) as f: 
    data = f.readlines() # Read the data from the file 

tickers_list = [] 
for line in data: 
    tickers_list.append(line) # Separate tickers into individual elements in list 

print (tickers_list[0]) # Check if printing correct ticker 
url = "http://ichart.finance.yahoo.com/table.csv?s=%s&a=00&b=1&c=2011&d=05&e=28&f=2013&g=d&ignore=.csv" % str(tickers_list[0]) 
print (url) # Check if printing correct URL 

URL = urllib.request.urlopen(url) 
html = URL.read() 
print (html) 

it gives me this error:

urllib.error.URLError: <urlopen error no host given> 

Am I not doing the string formatting correctly?

+0

You don't need to loop over data to copy everything into tickers_list; you can use data directly, because f.readlines() already returns a list. –

Answers

2

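The data you read from the file includes the newline character at the end of each line (.readlines() does not remove it). You need to remove it yourself; str.strip() removes all leading and trailing whitespace, newlines included: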

import urllib.request

filename = input("Please enter file name to extract data from: ")
with open(filename) as f: 
    tickers_list = f.readlines() # .readlines() returns a list *already* 

print(tickers_list[0].strip()) 
url = "http://ichart.finance.yahoo.com/table.csv?s=%s&a=00&b=1&c=2011&d=05&e=28&f=2013&g=d&ignore=.csv" % tickers_list[0].strip() 
print(url) 

response = urllib.request.urlopen(url) 
html = response.read() 
print(html) 

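You don't need to call str() on the tickers_list[0] element, because reading from a file already gives you a list of strings. Besides, the %s formatting placeholder converts its value to a string if it isn't one already.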
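The two points above can be checked without touching a real file. This is a minimal sketch, assuming io.StringIO as a stand-in for the opened ticker file; the file contents are made up for illustration.

```python
import io

# io.StringIO stands in for open(filename); the contents are hypothetical.
fake_file = io.StringIO("GOOG\nAAPL\n")
tickers_list = fake_file.readlines()

# repr() makes the trailing newline visible.
print(repr(tickers_list[0]))
print(repr(tickers_list[0].strip()))

# %s calls str() on its argument, so a non-string value is accepted too.
query = "s=%s" % 42
print(query)
```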

With the newline left in (the repr() output shows it as the \n character), you get exactly the error you are seeing:

>>> url = "http://ichart.finance.yahoo.com/table.csv?s=%s&a=00&b=1&c=2011&d=05&e=28&f=2013&g=d&ignore=.csv" % 'GOOG\n' 
>>> print(repr(url)) 
'http://ichart.finance.yahoo.com/table.csv?s=GOOG\n&a=00&b=1&c=2011&d=05&e=28&f=2013&g=d&ignore=.csv' 
>>> urllib.request.urlopen(url) 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mj/Development/Libraries/buildout.python/parts/opt/lib/python3.3/urllib/request.py", line 156, in urlopen
    return opener.open(url, data, timeout)
  File "/Users/mj/Development/Libraries/buildout.python/parts/opt/lib/python3.3/urllib/request.py", line 467, in open
    req = meth(req)
  File "/Users/mj/Development/Libraries/buildout.python/parts/opt/lib/python3.3/urllib/request.py", line 1172, in do_request_
    raise URLError('no host given')
urllib.error.URLError: <urlopen error no host given>

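If you only want to process one line from the input file, use f.readline() to read just that line and save yourself indexing into a list. You still need to strip the newline.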
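A minimal sketch of the single-line case, assuming a small helper named first_ticker (a hypothetical name, not part of the question) and io.StringIO in place of a real file:

```python
import io

def first_ticker(f):
    """Read one line from an open text file and strip surrounding whitespace."""
    return f.readline().strip()

# io.StringIO stands in for a file opened with open(filename).
ticker = first_ticker(io.StringIO("GOOG\nAAPL\n"))
print(repr(ticker))
```

With a real file this would be called as first_ticker(f) inside the with open(filename) as f: block. On an empty file, readline() returns an empty string, so the helper returns an empty string as well.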

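If you want to process all the lines, just loop directly over the input file, which yields each line separately, again with the newline attached: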

with open(filename) as f:
    for ticker_name in f:
        ticker_name = ticker_name.strip()
        url = "http://ichart.finance.yahoo.com/table.csv?s=%s&a=00&b=1&c=2011&d=05&e=28&f=2013&g=d&ignore=.csv" % ticker_name

        # etc.
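As a variation on the loop above, the query string could be built with urllib.parse.urlencode instead of % formatting. This is a swapped-in technique, not what the answer uses; the helper names build_url and urls_from_lines are hypothetical, and the parameter values are copied from the URL in the question. The sketch only builds the URLs and does not fetch them.

```python
import io
from urllib.parse import urlencode

BASE_URL = "http://ichart.finance.yahoo.com/table.csv"

def build_url(ticker):
    """Build the CSV URL for one ticker; urlencode percent-encodes the values."""
    params = {
        "s": ticker,
        "a": "00", "b": "1", "c": "2011",
        "d": "05", "e": "28", "f": "2013",
        "g": "d", "ignore": ".csv",
    }
    return BASE_URL + "?" + urlencode(params)

def urls_from_lines(lines):
    """Strip each line, skip blank ones, and return one URL per ticker."""
    urls = []
    for line in lines:
        ticker = line.strip()
        if ticker:
            urls.append(build_url(ticker))
    return urls

# io.StringIO stands in for the opened ticker file (hypothetical contents).
urls = urls_from_lines(io.StringIO("GOOG\n\nAAPL\n"))
for url in urls:
    print(url)
```

Stripping is still needed here: urlencode would encode a leftover newline as %0A rather than remove it, so the request would go out with the wrong symbol.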
2

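For manipulating URLs in Python I would suggest two solutions: furl and URLObject. Both libraries give you a very nice interface for manipulating URLs easily.

An example from the furl documentation: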

 
>>> from furl import furl 
>>> f = furl('http://www.google.com/?one=1&two=2') 
>>> f.args['three'] = '3' 
>>> del f.args['one'] 
>>> f.url 
'http://www.google.com/?two=2&three=3' 
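If installing a third-party library is not an option, roughly the same edit can be done with the standard library's urllib.parse. This is only a sketch of an alternative, not part of either library above; the helper name edit_query is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def edit_query(url, add=None, remove=()):
    """Return url with the keys in remove dropped and the pairs in add appended."""
    parts = urlsplit(url)
    # parse_qsl returns (key, value) pairs; by default it drops blank values.
    pairs = [(k, v) for k, v in parse_qsl(parts.query) if k not in remove]
    pairs.extend((add or {}).items())
    return urlunsplit(parts._replace(query=urlencode(pairs)))

new_url = edit_query(
    "http://www.google.com/?one=1&two=2",
    add={"three": "3"},
    remove=("one",),
)
print(new_url)
```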
+0

Thanks, I'll look into it. +1 – Goose