Converting strings of ints and floats into individual ints and floats in a CSV

I am using Scrapy to scrape premarket data for stocks. Here is the code used to scrape the website:

def parse(self, response):
    for sel in response.xpath('//body'):
        item = PremarketItem()
        item['volume'] = sel.xpath('//td[@class="tdVolume"]/text()').extract()
        item['last_price'] = sel.xpath('//div[@class="lastPrice"]/text()')[:30].extract()
        item['percent_change'] = sel.xpath(
            '//div[@class="chgUp"]/text()')[:15].extract() + sel.xpath('//div[@class="chgDown"]/text()')[:15].extract()
        item['ticker'] = sel.xpath('//a[@class="symbol"]/text()')[:30].extract()
        yield item

The output of the code above, written to a .csv file, looks something like this:

ticker,percent_change,last_price,volume 
"HTGM,SNCR,SAEX,IMMU,OLED,DAIO","27.43%,20.39%,17.28%,17.19%,15.69%","5,298350,700,1090000,76320,27190,13010",etc

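As you can see, the values are separated correctly, but they are all stuck inside large strings. I have tried several for loops, but nothing worked, and I could not find anything that helps. Thanks for your help!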

2017-05-06 Ian Scalzo

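Rather than splitting the large strings afterwards, you can fix the Scrapy code so that the values are separated in the first place.

Your item XPaths start with //, which selects every element in the document that matches your specification, so all of them end up in a single (huge) item. I assume your target site has some structure around the target items, such as table rows.

You then need to work out an XPath expression that matches the rows, and loop over those rows to parse one item per row. See the following pseudocode: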

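def parse(self, response):
    # Loop over table rows; adjust this XPath to match the target page.
    for sel in response.xpath('//table/tr'):
        item = PremarketItem()
        # Start the XPath with a dot so it is evaluated relative to the current row.
        # extract_first() returns a single string (or None) instead of a list.
        item['volume'] = sel.xpath('./td[@class="tdVolume"]/text()').extract_first()
        # ... other fields ...
        yield item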
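
The row-by-row idea can be tried without running a spider. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy selectors, and the markup and values are invented sample data, not the real page. It contrasts a document-wide query, which returns every matching cell at once, with queries evaluated relative to each row.

```python
import xml.etree.ElementTree as ET

# Invented sample markup; the real page structure is an assumption.
SAMPLE = """
<table>
  <tr><td class="symbol">HTGM</td><td class="tdVolume">298,350</td></tr>
  <tr><td class="symbol">SNCR</td><td class="tdVolume">700</td></tr>
</table>
"""

root = ET.fromstring(SAMPLE)

# Document-wide query: all volume cells land in one list,
# which is what produces a single huge item.
all_volumes = [td.text for td in root.findall(".//td[@class='tdVolume']")]

# Row-relative queries: one dictionary per table row.
rows = []
for tr in root.findall("./tr"):
    rows.append({
        "ticker": tr.find("./td[@class='symbol']").text,
        "volume": tr.find("./td[@class='tdVolume']").text,
    })

print(all_volumes)
print(rows)
```

ElementTree supports only a subset of XPath, so the expressions in a real spider may look different, but the leading dot plays the same role: it anchors the query at the current row.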

For examples of relative XPath expressions, see the Scrapy documentation.
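
Once each row yields its own item, the fields are still strings. Here is a minimal sketch for turning them into ints and floats before export, assuming that volumes may contain thousands separators and that percent changes end in a % sign. The helper names parse_int and parse_percent are hypothetical and not part of Scrapy.

```python
def parse_int(text):
    """Convert a string such as '1,090,000' to an int.

    Assumes commas are thousands separators.
    """
    return int(text.strip().replace(",", ""))


def parse_percent(text):
    """Convert a string such as '27.43%' to a float (27.43).

    Assumes the value is a plain decimal with an optional trailing %.
    """
    return float(text.strip().rstrip("%").replace(",", ""))


volume = parse_int(" 1,090,000 ")
change = parse_percent("27.43%")
print(volume, change)
```

In a spider these helpers could be called on the result of extract_first() for each field, after checking that the result is not None.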

2017-05-08 07:18:04
