Can't get rid of blank lines in CSV output


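I have written a very small script in Python Scrapy to parse the names, streets and phone numbers displayed across multiple pages of the Yellow Pages website. The script runs smoothly. The only problem is how the scraped data ends up in the CSV output: there is always a blank line between two rows, i.e. the data is written on every other line. The picture below shows what I mean. If it weren't for Scrapy, I could have used [newline='']. Unfortunately, I'm completely helpless here. How can I get rid of the blank lines in the CSV output? Thanks in advance for taking a look.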

items.py contains:

import scrapy 

class YellowpageItem(scrapy.Item): 
    name = scrapy.Field() 
    street = scrapy.Field() 
    phone = scrapy.Field()

Here is the spider:

import scrapy 

class YellowpageSpider(scrapy.Spider): 
    name = "YellowpageSp" 
    start_urls = ["https://www.yellowpages.com/search?search_terms=Pizza&geo_location_terms=Los%20Angeles%2C%20CA&page={0}".format(page) for page in range(2,6)] 

    def parse(self, response):
        for titles in response.css('div.info'):
            name = titles.css('a.business-name span[itemprop=name]::text').extract_first()
            street = titles.css('span.street-address::text').extract_first()
            phone = titles.css('div[itemprop=telephone]::text').extract_first()
            yield {'name': name, 'street': street, 'phone': phone}

Here is what the CSV output looks like:

By the way, the command I use to get the CSV output is:

scrapy crawl YellowpageSp -o items.csv -t csv


2017-08-27 SIM

I spoke too soon. This worked for me. I'm upvoting both the answer and the question :D – 2017-12-02 18:08:39

You can fix it by creating a new feed exporter. Change settings.py as follows:

FEED_EXPORTERS = { 
    'csv': 'project.exporters.FixLineCsvItemExporter', 
}

Then create an exporters.py in your project:

exporters.py

import csv
import io
import os

import six
# scrapy.contrib.exporter is the deprecated pre-1.0 path; use scrapy.exporters
from scrapy.exporters import CsvItemExporter
from scrapy.extensions.feedexport import IFeedStorage
from w3lib.url import file_uri_to_path
from zope.interface import implementer


@implementer(IFeedStorage)
class FixedFileFeedStorage(object):
    """Feed storage that opens the target file in binary append mode."""

    def __init__(self, uri):
        self.path = file_uri_to_path(uri)

    def open(self, spider):
        # Create the parent directory if it does not exist yet.
        dirname = os.path.dirname(self.path)
        if dirname and not os.path.exists(dirname):
            os.makedirs(dirname)
        return open(self.path, 'ab')

    def store(self, file):
        file.close()


class FixLineCsvItemExporter(CsvItemExporter):

    def __init__(self, file, include_headers_line=True, join_multivalued=',', **kwargs):
        super(FixLineCsvItemExporter, self).__init__(file, include_headers_line, join_multivalued, **kwargs)
        self._configure(kwargs, dont_fail=True)
        # Drop the stream created by the parent class and reopen the file ourselves.
        self.stream.close()
        storage = FixedFileFeedStorage(file.name)
        file = storage.open(file.name)
        # newline="" stops the text layer from translating the "\r\n" written by csv.
        self.stream = io.TextIOWrapper(
            file,
            line_buffering=False,
            write_through=True,
            encoding=self.encoding,
            newline="",
        ) if six.PY3 else file
        self.csv_writer = csv.writer(self.stream, **kwargs)

I'm on a Mac, so I can't test how this behaves on Windows. If the above doesn't work, change this part of the code and set newline="\n":

        self.stream = io.TextIOWrapper(
            file,
            line_buffering=False,
            write_through=True,
            encoding=self.encoding,
            newline="\n",
        ) if six.PY3 else file
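
To see why the newline argument matters, here is a minimal standalone sketch using only the standard library (no Scrapy). The helper name write_rows and the sample rows are made up for illustration. It passes newline="\r\n" to mimic the translation a Windows text stream applies by default: the "\r\n" that csv.writer emits then becomes "\r\r\n", which spreadsheet programs typically show as an extra blank row. With newline="" nothing is translated.

```python
import csv
import io


def write_rows(rows, newline):
    """Write rows with csv.writer through a text wrapper and return the raw bytes."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline, write_through=True)
    writer = csv.writer(text)  # csv.writer ends each row with "\r\n" by default
    writer.writerows(rows)
    text.flush()
    data = raw.getvalue()
    text.detach()  # release the buffer without closing it
    return data


# Hypothetical sample data, not taken from the real site.
rows = [["name", "phone"], ["Pizza Place", "555-0100"]]

# newline="\r\n" turns every "\n" into "\r\n" on write, like Windows text mode.
translated = write_rows(rows, newline="\r\n")
# newline="" leaves the line endings produced by csv.writer untouched.
untouched = write_rows(rows, newline="")

print(repr(translated))
print(repr(untouched))
```

Setting newline="\n" also disables translation on write, which is why it is offered above as a fallback.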


2017-08-27 20:56:42

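Thanks a lot, Tarun. You are truly a Python giant. I've been suffering from this problem for years, and I created and deleted several threads trying to find a solution. You just solved it all. I wish I could hit the upvote button a million times. By the way, may I use this in other projects as well? It is far too big and difficult for me to write myself. Thanksssssssssssssssss a zillion. – SIM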

You can use it anywhere –

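One last thing, if I'm not being too greedy. Sometimes the positions of the columns change. It's not a big problem, but I'd love to know why. You'll see what I mean if you look at the link. It's unrelated to this thread, so please feel free to ignore it. https://www.dropbox.com/s/tzztbli7v6quhc2/column%20order.txt?dl=0 – SIM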
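
On the column-order question, one possible approach (not part of the answer above, and untested against this project) is Scrapy's FEED_EXPORT_FIELDS setting, which defines the fields to export and their order. The field names below are the ones the spider yields.

```python
# settings.py
# Export only these fields, in this order, in the feed output.
FEED_EXPORT_FIELDS = ["name", "street", "phone"]
```

When this setting is unset, the order comes from the items themselves, and a plain dict in older Python versions does not guarantee a stable key order.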
