2017-12-27 578 views
-1

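Comparing values by date in a Python list

I am rewriting a CSV file and want to create a function that compares items across the rows of a list. To make it clearer, here is an example.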

My CSV, converted to a list:

import csv 
with open('test.csv', 'rb') as csvfile: 
    spamreader = csv.reader(csvfile, delimiter=';', quotechar='|') 
    lista = list(spamreader) 
    print lista 

>>>[['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"'],['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'], ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'], ['22/12/2017', 'Martin', '204.042', '48.526', '23,78%', '43,98', '0,91'], ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'], ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'], ['22/12/2017', 'Tom', '60.189', '12.654', '21,02%', '11,58', '0,92']]

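So, first of all, I need to compare all the values for Martin and for Tom. I mean item[2] of 20/12/2017 against item[2] of 21/12/2017, then item[2] of 21/12/2017 against item[2] of 22/12/2017. I need this for every item in my list (items 2, 3, 4, 5 and 6). The date is the most important value, because the idea is to compare day by day.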

The result I want looks like this:

21/12/2017 Martin 
item[2]: smaller 
item[3]: smaller 
item[4]: bigger 
item[5]: smaller 
item[6]: smaller 

22/12/2017 Martin 
item[2]: smaller 
item[3]: bigger 
item[4]: bigger 
item[5]: bigger 
item[6]: bigger 

21/12/2017 Tom 
item[2]: smaller 
item[3]: bigger 
item[4]: bigger 
item[5]: bigger 
item[6]: bigger 

22/12/2017 Tom 
item[2]: smaller 
item[3]: smaller 
item[4]: smaller 
item[5]: smaller 
item[6]: bigger 

And if I want to display the name "Subastas" instead of item[2], and likewise for all the other names, how can I do that?

+1

Maybe use the `pandas` module - it is more powerful. – furas

+0

Use the `{}` button to format the list, just as you formatted the code. – furas

Answers

2

Let's begin by noticing that your data is keyed by (date, name). A fairly obvious approach is to store the data in a dictionary that uses (date, name) as its key.

So, take your posted data as mylist:

mylist = [['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"'],['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'], ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'], ['22/12/2017', 'Martin', '204.042', '48.526', '23,78%', '43,98', '0,91'], ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'], ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'], ['22/12/2017', 'Tom', '60.189', '12.654', '21,02%', '11,58', '0,92']] 

and convert it (apart from the first row, which holds the column labels) to a dictionary like this:

import datetime 
mydict = {} 
for row in mylist[1:]: 
    date = datetime.datetime.strptime(row[0],'%d/%m/%Y') 
    name = row[1] 
    mydict[(date,name)] = row[2:] 

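The tricky bit here is that your dates are strings of the form dd/mm/yyyy, but later you want to compare one day with the next. That is not surprising, since you made it the subject of your question. So you need to convert the string dates into something you can compare properly. That is what strptime() does.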
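To see why the conversion matters, here is a small sketch (Python 3 syntax; the two sample dates and the variable names are illustrative, not taken from your data) that compares the same pair of dates first as strings and then as datetime objects:

```python
import datetime

# Two dates in the dd/mm/yyyy format used in the question.
earlier = '31/12/2016'
later = '01/01/2017'

# As strings, comparison goes character by character, so '3' > '0'
# makes the earlier date look "bigger".
string_says_earlier_is_bigger = earlier > later

# As datetime objects, comparison follows the calendar.
fmt = '%d/%m/%Y'
earlier_dt = datetime.datetime.strptime(earlier, fmt)
later_dt = datetime.datetime.strptime(later, fmt)
datetime_says_earlier_is_smaller = earlier_dt < later_dt

# Date arithmetic also works, which is what looking up the
# "previous day" relies on.
previous_day = later_dt - datetime.timedelta(days=1)

print(string_says_earlier_is_bigger)     # True
print(datetime_says_earlier_is_smaller)  # True
print(previous_day.strftime(fmt))        # 31/12/2016
```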

Your data now looks like this:

>>> mydict 
{(datetime.datetime(2017, 12, 20, 0, 0), 'Martin'): ['165.665', '3.777', '2,28%', '1,58', '0,42'], 
(datetime.datetime(2017, 12, 22, 0, 0), 'Tom'): ['60.189', '12.654', '21,02%', '11,58', '0,92'], 
(datetime.datetime(2017, 12, 21, 0, 0), 'Martin'): ['229.620', '18.508', '8,06%', '14,56', '0,79'], 
(datetime.datetime(2017, 12, 21, 0, 0), 'Tom'): ['90.962', '19.186', '21,09%', '14,26', '0,74'], 
(datetime.datetime(2017, 12, 20, 0, 0), 'Tom'): ['102.613', '20.223', '19,71%', '17,86', '0,88'], 
(datetime.datetime(2017, 12, 22, 0, 0), 'Martin'): ['204.042', '48.526', '23,78%', '43,98', '0,91']} 

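The next thing to observe is that your data consists of floating-point numbers and percentages, but represented as strings. This complicates things, because you want to do comparisons. Take the first 2 data points for Martin:

['165.665', '3.777', ... 
['229.620', '18.508', ... 

If you compare '165.665' with '229.620', the first will be smaller, which is what you expect. But if you compare '3.777' with '18.508', the first will be bigger: not what you expect. This is because strings are compared alphabetically, and 3 comes after 1.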
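The difference is easy to check for yourself. A minimal sketch (Python 3 syntax, variable names are illustrative) using the two values just mentioned:

```python
first = '3.777'
second = '18.508'

# String comparison: '3' sorts after '1', so the first value looks bigger.
as_strings = first > second

# Numeric comparison after converting both values to float.
as_numbers = float(first) > float(second)

print(as_strings)  # True
print(as_numbers)  # False
```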

Worse, your data sometimes uses a comma for the decimal point and sometimes does not.

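So you need a function that converts the strings to numbers. Here is a naive one that works for your data, but would probably need to be more robust in real life: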

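def convert(n): 
    # Use '.' as the decimal point and drop any percent sign
    n = n.replace(",", ".").replace("%", "") 
    try: 
        return float(n) 
    except ValueError: 
        # Anything that is not a number counts as zero
        return 0.0 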
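As one possible direction for "more robust", here is a sketch of a variant that returns None instead of zero for text it cannot parse, so that bad input is not mistaken for a real 0. The name parse_number is hypothetical, and the code is written in Python 3 syntax. Like convert(), it assumes the dot in values such as '165.665' is a decimal point; if those dots are really thousands separators, which your post does not say, the cleaning step would need to change.

```python
def parse_number(text):
    """Convert strings such as '2,28%' or '165.665' to float.

    Returns None (rather than 0.0) when the text is not numeric,
    so bad input cannot be confused with a real zero.
    """
    # Drop surrounding whitespace and quotes, then normalise the decimal mark.
    cleaned = text.strip().strip('"').replace('%', '').replace(',', '.')
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_number('2,28%'))    # 2.28
print(parse_number('165.665'))  # 165.665
print(parse_number('n/a'))      # None
```

Callers then have to decide what to do with None, for example skipping that comparison, which is usually safer than silently comparing against zero.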

Now you are in a position to do the comparison:

for (day, name) in mydict: 
    previous_day = day - datetime.timedelta(days=1) 
    if (previous_day, name) in mydict: 
        print datetime.datetime.strftime(day, "%d/%m/%Y"), name 
        day2_values = mydict[(day, name)] 
        day1_values = mydict[(previous_day, name)] 
        # Pair each value for this day with the same column of the previous day
        comparer = zip(day2_values, day1_values) 
        for n, value in enumerate(comparer): 
            print "item[%d]:" % (n+2,), 
            if convert(value[1]) < convert(value[0]): 
                print value[1], "smaller than", value[0] 
            else: 
                print value[1], "bigger than", value[0] 
        print 

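I have made the messages more explicit, for example item[2]: 165.665 smaller than 229.620. That way you can easily verify that the program is correct without going back to check the data, which is error-prone and tedious. You can always make the messages less explicit if you prefer.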

22/12/2017 Tom 
item[2]: 90.962 bigger than 60.189 
item[3]: 19.186 bigger than 12.654 
item[4]: 21,09% bigger than 21,02% 
item[5]: 14,26 bigger than 11,58 
item[6]: 0,74 smaller than 0,92 

21/12/2017 Martin 
item[2]: 165.665 smaller than 229.620 
item[3]: 3.777 smaller than 18.508 
item[4]: 2,28% smaller than 8,06% 
item[5]: 1,58 smaller than 14,56 
item[6]: 0,42 smaller than 0,79 

21/12/2017 Tom 
item[2]: 102.613 bigger than 90.962 
item[3]: 20.223 bigger than 19.186 
item[4]: 19,71% smaller than 21,09% 
item[5]: 17,86 bigger than 14,26 
item[6]: 0,88 bigger than 0,74 

22/12/2017 Martin 
item[2]: 229.620 bigger than 204.042 
item[3]: 18.508 smaller than 48.526 
item[4]: 8,06% smaller than 23,78% 
item[5]: 14,56 smaller than 43,98 
item[6]: 0,79 smaller than 0,91 
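The blocks above come out in no particular order because the loop walks the dictionary keys directly, and a Python 2 dictionary does not keep them sorted. If you want a predictable order, one option is to sort the keys first. The following is only a sketch in Python 3 syntax: compare_days is a hypothetical helper, the data is a subset of yours, convert() is copied from above, and the extra "equal to" case is my own addition for values that do not change.

```python
import datetime


def convert(n):
    n = n.replace(',', '.').replace('%', '')
    try:
        return float(n)
    except ValueError:
        return 0.0


# A small subset of the data from the question.
rows = [
    ['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'],
    ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'],
    ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'],
    ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'],
]

mydict = {}
for row in rows:
    day = datetime.datetime.strptime(row[0], '%d/%m/%Y')
    mydict[(day, row[1])] = row[2:]


def compare_days(data):
    """Return report lines, ordered by name and then by date."""
    lines = []
    for day, name in sorted(data, key=lambda key: (key[1], key[0])):
        previous_day = day - datetime.timedelta(days=1)
        if (previous_day, name) not in data:
            continue  # nothing to compare the first day against
        lines.append('%s %s' % (day.strftime('%d/%m/%Y'), name))
        pairs = zip(data[(previous_day, name)], data[(day, name)])
        for n, (old, new) in enumerate(pairs):
            if convert(old) < convert(new):
                verdict = 'smaller than'
            elif convert(old) > convert(new):
                verdict = 'bigger than'
            else:
                verdict = 'equal to'
            lines.append('item[%d]: %s %s %s' % (n + 2, old, verdict, new))
    return lines


report = compare_days(mydict)
print('\n'.join(report))
```

Returning the lines instead of printing them inside the loop also makes the result easy to check or to write to a file.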

To display "Subastas" instead of item[2], remember that the column labels are in the first element of mylist:

>>> mylist[0] 
['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"'] 

So to include them in the output, you need to change this line:

print "item[%d]:" % (n+2,), 

to this:

print mylist[0][n+2] + ":", 
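One detail to be aware of: because the file was read with quotechar='|', the labels in mylist[0] still contain literal double quotes, so the label is printed as "Subastas": with the quotes included. If you would rather print the bare name, a minimal sketch (Python 3 syntax; header and labels are illustrative names) is to strip the quotes once up front:

```python
header = ['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"',
          '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"']

# Remove the literal double quotes left around each label.
labels = [label.strip('"') for label in header]

n = 0  # position within the value columns, as in the comparison loop
print(labels[n + 2] + ':')  # Subastas:
```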
+0

+1 for the effort. –

+0

And if I want to show the name "Subastas" instead of item[2], and all the other names too... how do I do that? –

+0

I need help with my problem: https://stackoverflow.com/questions/48001004/data-csv-dashboard-python. It is very similar! Thanks –

0

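You can load lista into a DataFrame and then perform the comparison from there. This compares Martin against Tom on each date: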

import pandas as pd 
import numpy as np 

headers = lista.pop(0) 

df = pd.DataFrame(lista, columns=headers) 

def to_number(s): 
    # '2,28%' -> 2.28, '165.665' -> 165.665 
    return float(s.replace(',', '.').replace('%', '')) 

martin = df[df['"Cliente"'] == 'Martin'] 
tom = df[df['"Cliente"'] == 'Tom'] 

# After the merge, Martin's columns end in _x and Tom's in _y 
merge = pd.merge(martin, tom, on='"Fecha"') 

stats = headers[2:] 
compare = ['"Fecha"'] 

for x in stats: 
    # Compare whole columns as numbers, one result per date 
    martin_values = merge[x + '_x'].map(to_number) 
    tom_values = merge[x + '_y'].map(to_number) 
    merge[x + '_compare'] = np.where(martin_values > tom_values, 'Martin', 'Tom') 
    compare.append(x + '_compare') 

print(merge[compare]) 

#output 
"Fecha" "Subastas"_compare "Impresiones_exchange"_compare "Fill_rate"_compare "Importe_a_pagar_a_medio"_compare "ECPM_medio"_compare 
20/12/2017 Martin Tom Tom Tom Tom 
21/12/2017 Martin Tom Tom Martin Martin 
22/12/2017 Martin Martin Martin Martin Tom