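I'm collecting data for a list of stocks and putting all of the information into a DataFrame. The list has about 700 stocks. Is there a way to speed up this web-scraping iteration with pandas?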
import pandas as pd
stock = ['adma', 'aapl', 'fb']  # the real list has about 700 stocks, extracted from a pickled DataFrame that was storing the info
# The site I'm visiting is below, with the name of the stock appended to the end of the link:
##http://finviz.com/quote.ashx?t=adma
##http://finviz.com/quote.ashx?t=aapl
I'm extracting only one section of the page, namely the table at index [-2], as shown below:
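frames = []
for ticker in stock:
    # Take the second-to-last table on the page and index it by the 'SEC Form 4' column
    df = pd.read_html('http://finviz.com/quote.ashx?t={}'.format(ticker), header=0)[-2].set_index('SEC Form 4')
    df['Stock'] = ticker.upper()  # column holding the stock name, so I can differentiate between stocks
    frames.append(df)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
df2 = pd.concat(frames)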
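The code feels like it takes a few seconds per iteration, and I currently have about 700 iterations to get through. It's not terribly slow, but I'm curious whether there is a more efficient way. Thanks.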
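
One possible direction, offered only as a sketch: if most of each iteration is spent waiting on the network (likely, but not measured here), the requests can be issued from a small thread pool instead of one after another. The helper name `fetch_all`, the `fetch_one` parameter, and the `fake_fetch` stand-in are hypothetical names introduced for illustration; the stand-in replaces the real download so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(tickers, fetch_one, max_workers=8):
    """Call fetch_one(ticker) for every ticker on a thread pool.

    Returns the results as a list in the same order as `tickers`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps the input order; an exception raised in a worker
        # is re-raised here when its result is consumed.
        return list(pool.map(fetch_one, tickers))


def fake_fetch(ticker):
    # Hypothetical stand-in for the real page download (assumption:
    # the real version would return one DataFrame per ticker).
    return {"Stock": ticker.upper()}


results = fetch_all(["adma", "aapl", "fb"], fake_fetch, max_workers=3)
```

In the scraping case, `fetch_one` would wrap the existing `pd.read_html(...)[-2].set_index('SEC Form 4')` call plus the `Stock` column assignment, and the returned list would be passed to `pd.concat`. Keeping `max_workers` modest is an assumption about what the site tolerates, not something verified against finviz.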
Check my [question](http://stackoverflow.com/questions/40641166/how-to-add-an-id-column-to-identify-read-html-tables); it might help you. – tumbleweed