2017-07-06 81 views
-3

My data is a Spark Dataset that looks like this:

+------+---------------+----+
|  City|      Timestamp|Sale|
+------+---------------+----+
|City 3|6/30/2017 16:04|  28|
|City 4| 7/4/2017 16:04|  12|
|City 2|7/13/2017 16:04|   8|
|City 4|7/16/2017 16:04|  21|
|City 4| 7/3/2017 16:04|  24|
|City 2|7/17/2017 16:04|  34|
|City 3| 7/9/2017 16:04|  13|
|City 3|7/18/2017 16:04|  26|
|City 3| 7/6/2017 16:04|  16|
|City 3|7/15/2017 16:04|  29|
|City 4|7/18/2017 16:04|  39|
|City 2| 7/1/2017 16:04|  19|
|City 2|7/18/2017 16:04|  19|
|City 4| 7/4/2017 16:04|  24|
|City 2| 7/4/2017 16:04|   9|
|City 4|7/15/2017 16:04|  20|
|City 3|7/12/2017 16:04|  19|
|City 1| 7/9/2017 16:04|  13|
|City 1|7/13/2017 16:04|  25|
|City 4|7/10/2017 16:04|  10|
+------+---------------+----+

I need to compute the total Sale for each City on a weekly basis.

+0

Please provide a solution in Spark Scala –

Answer

0

You can group by City and Timestamp, then sum the Sale column:

import org.apache.spark.sql.functions.{col, sum}
data.groupBy("City", "Timestamp").agg(sum(col("Sale")).as("TotalSale")).show

Hope this helps!

+0

Hi, it works fine, but I need the result week-wise, i.e. the weekly total of sales for each city –

+0

Hi, it works fine, but I need the sum week-wise, i.e. for every week I need the total sales in each city [City 1  Friday 01/06/2017  38] –

+0

How could anyone determine the week from the input you have given? –
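
For the week-wise total asked for in the comments, here is a minimal sketch rather than a definitive answer. It assumes the Timestamp strings follow the pattern M/d/yyyy HH:mm and that a "week" means an ISO-8601 week (Monday to Sunday), which is what Spark's weekofyear returns; the question does not say how weeks should be defined. The object name WeeklySales, the helper columns ts, Year and Week, and the local SparkSession are illustrative only, and the sample rows are a few copied from the table in the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum, unix_timestamp, weekofyear, year}

object WeeklySales {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; reuse your existing session in practice.
    val spark = SparkSession.builder()
      .appName("WeeklySales")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A few rows copied from the question's table.
    val data = Seq(
      ("City 3", "6/30/2017 16:04", 28),
      ("City 4", "7/4/2017 16:04", 12),
      ("City 4", "7/3/2017 16:04", 24),
      ("City 1", "7/9/2017 16:04", 13),
      ("City 1", "7/13/2017 16:04", 25)
    ).toDF("City", "Timestamp", "Sale")

    val weekly = data
      // Assumption: timestamps are formatted as M/d/yyyy HH:mm.
      .withColumn("ts", unix_timestamp(col("Timestamp"), "M/d/yyyy HH:mm").cast("timestamp"))
      // Assumption: ISO-8601 weeks. Around New Year the calendar year and the
      // ISO week can disagree, which does not affect the July data shown here.
      .withColumn("Year", year(col("ts")))
      .withColumn("Week", weekofyear(col("ts")))
      .groupBy("City", "Year", "Week")
      .agg(sum(col("Sale")).as("TotalSale"))
      .orderBy("City", "Year", "Week")

    weekly.show()
    spark.stop()
  }
}
```

If weeks should start on a different day, or each week should be labelled with a date as in the comment above, the grouping key would need to change accordingly.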