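How do I split comma-separated data and create a list from the data in Python?

I am trying to create a function that takes two dates in YYYY/MM/DD format, reads the data, and returns a list of lists containing the latitude, longitude, magnitude, and depth of the earthquakes between the two dates. The data is formatted like this: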

Date,TimeUTC,Latitude,Longitude,Magnitude,Depth 
2012/02/23,08:09:13.0,-20.984,-178.654,4.6,526

Here is my attempt:

from tempBetweenDates import dateLessThan 
import urllib.request 

def betweenDates(date1, date2, date3): 
    """Determines if the first date is on the second or between the second and third date.""" 
    date_1 = date1.split('/') 
    date_2 = date2.split('/') 
    date_3 = date3.split('/') 
    if int(date_1[0]) >= int(date_2[0]) and int(date_1[1]) >= int(date_2[1]) and int(date_1[2]) >= int(date_2[2]) and dateLessThan(int(date_1[1]), int(date_1[2]), int(date_1[0]), int(date_3[1]), int(date_3[2]), int(date_3[0])) == True: 
        return True
    else:
        return False

def parseEarthquakeData(date1, date2): 
    page = urllib.request.urlopen("http://www.choongsoo.info/teach/mcs177-sp12/projects/earthquake/earthquakeData-02-23-2012.txt") 
    eqdata = page.readlines() 
    dataList = [] 
    for line in eqdata: 
     lineSplit = line.split(',') 
     date = lineSplit[0] 
     data = lineSplit[2:6] 
     dataList = [[data] for line in eqdata if betweenDates(date, date1, date2) == True] 
    return(dataList)

Whenever I try to run the code I get an error:

Traceback (most recent call last): 
    File "<pyshell#2>", line 1, in <module> 
    parseEarthquakeData("2012/02/22", "2012/02/19") 
    File "C:\Users\lcooper2\Desktop\Python\PROJECTS\plotEarthquakes.py", line 20, in parseEarthquakeData 
    lineSplit = line.split(',') 
TypeError: Type str doesn't support the buffer API

Any tips on how to avoid this error?

Source

2015-04-04 Logan Cooper

Holy crud, do you need to go discover the 'datetime' module! :) – 2015-04-04 03:12:48

if(betweenDates(date, date1, date2)) – CY5 2015-04-04 03:24:13

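In Python 3.x, readlines() on the response returned by urllib.request.urlopen gives you byte strings (bytes), not str. Python 3 is designed to be more type-safe and encoding-friendly, so its methods do not let you mix bytes and str.

So your split method is actually being called on a bytes object, and it expects a bytes separator rather than a str.

So either convert your data back to a string before splitting

lineSplit = line.decode().split(',')

or pass a bytes separator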

lineSplit = line.split(b',')
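
To make the two options concrete, here is a minimal sketch. The hard-coded `line` is an assumption standing in for one element of `page.readlines()`, so the snippet runs without network access, and UTF-8 is assumed as the encoding.

```python
# Hypothetical sample standing in for one line returned by page.readlines().
line = b"2012/02/23,08:09:13.0,-20.984,-178.654,4.6,526\n"

# Option 1: decode the bytes to str first, then split on a str separator.
# decode() defaults to UTF-8; the real encoding of the file is an assumption.
fields_str = line.decode().strip().split(',')

# Option 2: keep the data as bytes and split on a bytes separator.
fields_bytes = line.strip().split(b',')

print(fields_str[0])    # 2012/02/23
print(fields_bytes[0])  # b'2012/02/23'
```

With the second option every field stays a bytes object, so it would still need decoding before being compared with str dates such as "2012/02/22".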

Source

2015-04-04 03:29:02 Abhijit

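Actually, you can do something way cooler than this! If you pass the response from urllib.request.urlopen through csv.DictReader, you can eliminate a lot of the splitting and assigning.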

import csv 
import datetime 
import urllib.request 

page = urllib.request.urlopen("http://www.choongsoo.info/teach/mcs177-sp12/projects/earthquake/earthquakeData-02-23-2012.txt") 
reader = csv.DictReader((line.decode() for line in page), delimiter=',') 

for line in reader: 
    # each line looks like: 
    # {'Longitude': '-178.654', 'Date': '2012/02/23', 
    # 'Depth': '526', 'Magnitude': '4.6', 'Latitude': '-20.984', 
    # 'TimeUTC': '08:09:13.0'} 
    # so you can use it like a dictionary! 
    date = datetime.datetime.strptime(line['Date'], "%Y/%m/%d") 
    # datetime objects like this aren't naive like numbers, so you can do: 
    # datetime.datetime(year=2012, month=2, day=23) < datetime.datetime(year=2012, month=2, day=24) 
    # and expect it to return True every time. This will massively simplify your 
    # betweenDates function.

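The reason for the error in your traceback is that urllib.request.urlopen gives you an HTTPResponse object. It is an iterator that yields bytes objects, not str objects. Calling bytes.decode() turns them into strings, so you can do stringy things to them, like splitting them.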
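
The behaviour described here can be reproduced offline. In this sketch `io.BytesIO` is only a stand-in (an assumption) for the HTTPResponse object, since both yield bytes lines when iterated; `fake_page` and the sample content are hypothetical.

```python
import io

# Stand-in for the object returned by urllib.request.urlopen (assumption).
fake_page = io.BytesIO(
    b"Date,TimeUTC,Latitude,Longitude,Magnitude,Depth\n"
    b"2012/02/23,08:09:13.0,-20.984,-178.654,4.6,526\n"
)

raw_lines = list(fake_page)          # iterating yields bytes, one per line
first = raw_lines[0]

try:
    first.split(',')                 # str separator on a bytes object
    split_failed = False
except TypeError:
    split_failed = True              # same kind of failure as in the traceback

# Decoding each line first makes the ordinary str methods available.
rows = [raw.decode().rstrip('\n').split(',') for raw in raw_lines]

print(type(first).__name__)  # bytes
print(split_failed)          # True
print(rows[1][2:6])          # ['-20.984', '-178.654', '4.6', '526']
```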

If you switch to using these datetime objects, your betweenDates function becomes:

def between_dates(date1, date2, date3): 
    return date2 <= date1 < date3
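
Putting the pieces together, one possible shape for the whole function is sketched below. The name `parse_earthquake_data`, its `lines` parameter, and the conversion of fields to float are assumptions rather than part of the original answer, and the second data row is made up. As in `between_dates` above, the end date is exclusive.

```python
import csv
import datetime
import io

DATE_FORMAT = "%Y/%m/%d"

def between_dates(date1, date2, date3):
    return date2 <= date1 < date3

def parse_earthquake_data(lines, start, end):
    """Return [latitude, longitude, magnitude, depth] rows dated in [start, end).

    `lines` is any iterable of bytes lines, e.g. the urlopen response.
    """
    start_date = datetime.datetime.strptime(start, DATE_FORMAT)
    end_date = datetime.datetime.strptime(end, DATE_FORMAT)
    reader = csv.DictReader(line.decode() for line in lines)
    result = []
    for row in reader:
        date = datetime.datetime.strptime(row['Date'], DATE_FORMAT)
        if between_dates(date, start_date, end_date):
            result.append([float(row['Latitude']), float(row['Longitude']),
                           float(row['Magnitude']), float(row['Depth'])])
    return result

# Offline sample standing in for the downloaded file (second row is made up).
sample = io.BytesIO(
    b"Date,TimeUTC,Latitude,Longitude,Magnitude,Depth\n"
    b"2012/02/23,08:09:13.0,-20.984,-178.654,4.6,526\n"
    b"2012/02/18,01:02:03.0,10.5,20.25,5.1,33\n"
)
print(parse_earthquake_data(sample, "2012/02/19", "2012/02/24"))
# [[-20.984, -178.654, 4.6, 526.0]]
```

Against the real data, the object returned by urllib.request.urlopen could be passed as `lines` in place of `sample`.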

Source

2015-04-04 03:33:19

Is there a problem with my answer, other than that it avoids if (betweenDates(date, date1, date2) == True)? If that's the reason for the downvote, I'd love to correct it. – 2015-04-04 03:46:24

+1 for justice – wim 2015-04-04 05:15:01

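NMDV, but I'm not sure about decoding line by line. Conceptually, I'm not sure lines are even defined until the file has been decoded; that the splitting works is more coincidence than anything else. Why not simply decode and then iterate over the result of splitlines? – DSM 2015-04-04 15:17:12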
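
The alternative raised in the comment above (decode the whole payload once, then split into lines) might look like this. It is only a sketch: `io.BytesIO` stands in for the urlopen response, the sample content is hypothetical, and UTF-8 is assumed as the encoding.

```python
import csv
import io

# Stand-in for the urlopen response (assumption); the real code would call
# page.read() on the object returned by urllib.request.urlopen.
fake_page = io.BytesIO(
    b"Date,TimeUTC,Latitude,Longitude,Magnitude,Depth\n"
    b"2012/02/23,08:09:13.0,-20.984,-178.654,4.6,526\n"
)

text = fake_page.read().decode()            # decode everything in one step
reader = csv.DictReader(text.splitlines())  # lines are defined on str here
rows = list(reader)

print(rows[0]['Latitude'])  # -20.984
```

This reads the whole response into memory at once, which is a trade-off against the line-by-line generator used in the answer.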
