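I have a DataFrame containing a set of values measured at different times. I want to normalize the values for each day so that they sum to one. How can the values for each day be normalized in a pandas DataFrame?
Specifically, I have data of the following form: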
value
datetime
2017-03-08 14:36:06.616166 1002.49
2017-03-08 15:06:07.661818 992.68
2017-03-08 15:36:08.597443 984.34
2017-03-08 16:06:09.265451 989.32
2017-03-08 16:36:10.581452 1004.00
2017-03-08 17:06:11.269434 1003.97
2017-03-08 17:36:12.117443 994.80
2017-03-08 18:06:12.809445 994.17
2017-03-08 18:36:14.029444 997.93
2017-03-08 19:06:14.654631 989.65
2017-03-08 19:36:15.413438 991.14
2017-03-08 20:06:16.145432 984.65
2017-03-08 20:36:17.265443 993.30
2017-03-08 21:06:18.117434 981.18
2017-03-08 21:36:19.165447 987.64
2017-03-08 22:06:19.909443 985.26
2017-03-08 22:36:20.569442 980.40
2017-03-08 23:06:21.197446 988.59
2017-03-08 23:36:21.989448 984.59
2017-03-09 00:06:22.665448 983.91
2017-03-09 00:36:23.281681 993.65
2017-03-09 01:06:23.857440 986.69
2017-03-09 01:36:24.441713 984.04
2017-03-09 02:06:25.117453 989.92
2017-03-09 02:36:25.953449 978.82
2017-03-09 03:06:26.521704 987.42
2017-03-09 03:36:27.157448 996.66
2017-03-09 04:06:27.725445 996.66
2017-03-09 04:36:29.201442 996.66
2017-03-09 05:06:29.765443 989.82
... ...
2017-03-22 20:16:24.007637 833.74
2017-03-22 20:46:24.583127 834.69
2017-03-22 21:16:25.217536 829.66
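I would like to normalize all the values for 2017-03-08, for 2017-03-09, and so on, each day separately, and add these normalized values as a new column.
A simple normalization function for a list of values is as follows: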
    def normalize(x, summation=None):
        if summation is None:
            summation = sum(x)  # normalize to unity
        return [element / summation for element in x]
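As a quick check of this function, here is a minimal sketch that applies it to a short list. The three inputs are just the first three readings from the data above, chosen for illustration:

```python
def normalize(x, summation=None):
    # With no explicit total, divide by the sum so the results add up to one.
    if summation is None:
        summation = sum(x)
    return [element / summation for element in x]

# Illustrative input: the first three readings from 2017-03-08.
values = [1002.49, 992.68, 984.34]
normalized = normalize(values)

# The normalized values add up to one, up to floating-point rounding.
print(sum(normalized))
```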
So, for 2017-03-08, the normalized values would look like this:
value value_day_normalized
datetime
2017-03-08 14:36:06.616166 1002.49 0.0532386976171
2017-03-08 15:06:07.661818 992.68 0.0527177232197
2017-03-08 15:36:08.597443 984.34 0.0522748153223
2017-03-08 16:06:09.265451 989.32 0.0525392855057
2017-03-08 16:36:10.581452 1004.00 0.0533188883755
2017-03-08 17:06:11.269434 1003.97 0.0533172951817
2017-03-08 17:36:12.117443 994.80 0.0528303089203
2017-03-08 18:06:12.809445 994.17 0.0527968518489
2017-03-08 18:36:14.029444 997.93 0.052996532148
2017-03-08 19:06:14.654631 989.65 0.0525568106383
2017-03-08 19:36:15.413438 991.14 0.0526359392674
2017-03-08 20:06:16.145432 984.65 0.0522912783257
2017-03-08 20:36:17.265443 993.30 0.0527506492265
2017-03-08 21:06:18.117434 981.18 0.0521069989007
2017-03-08 21:36:19.165447 987.64 0.0524500666486
2017-03-08 22:06:19.909443 985.26 0.0523236732678
2017-03-08 22:36:20.569442 980.40 0.0520655758599
2017-03-08 23:06:21.197446 988.59 0.052500517788
2017-03-08 23:36:21.989448 984.59 0.0522880919379
Is there a way to do this? I have a feeling it might involve the DataFrame method groupby, but I am not sure how to approach it.
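One possible approach along these lines, as a minimal sketch rather than a definitive answer: group the rows by calendar day, compute each day's sum with `transform("sum")`, and divide. The sketch assumes the index is a `DatetimeIndex`. The small sample frame is made up from a few of the rows shown above, so its results will not match the full-day figures in the table.

```python
import pandas as pd

# Small sample in the same shape as the data above (assumed for illustration).
df = pd.DataFrame(
    {"value": [1002.49, 992.68, 984.34, 983.91, 993.65]},
    index=pd.to_datetime([
        "2017-03-08 14:36:06.616166",
        "2017-03-08 15:06:07.661818",
        "2017-03-08 15:36:08.597443",
        "2017-03-09 00:06:22.665448",
        "2017-03-09 00:36:23.281681",
    ]),
)
df.index.name = "datetime"

# Truncate each timestamp to midnight so rows from the same day share a key.
day = df.index.normalize()

# transform("sum") returns the daily total aligned to every original row.
daily_sum = df.groupby(day)["value"].transform("sum")

# Divide each value by its own day's total and store it as a new column.
df["value_day_normalized"] = df["value"] / daily_sum

print(df)
```

Using `transform` rather than `apply` keeps the result aligned with the original index, so it can be assigned directly as a new column. The plain `normalize` function above could also be passed to `transform` via a lambda, though the vectorized division is likely faster on larger frames.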