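Processing time prediction

I am running some processing that handles a fairly large amount of data. I ran several tests with a fixed number of records (1 million, 10 million and 100 million) and measured the execution time with time(1). The results are in the following CSV (columns: records, extra processing, time, user time, sys time):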
1000000,false,4.29,13.62,0.48
1000000,true,8.78,28.28,0.89
10000000,false,69.17,229.20,8.26
10000000,true,106.89,343.34,11.78
100000000,false,1053.46,3058.38,126.66
100000000,true,1255.68,4011.54,143.87
1000000,false,8.40,27.86,1.01
1000000,true,12.59,40.75,1.44
10000000,false,92.84,309.81,10.85
10000000,true,125.52,410.81,14.06
100000000,false,963.49,2935.52,116.03
100000000,true,1435.18,4238.75,154.30
1000000,false,9.12,29.94,1.14
1000000,true,12.90,42.21,1.48
10000000,false,96.32,321.50,11.65
10000000,true,122.68,400.36,13.92
100000000,false,872.66,2876.10,109.40
100000000,true,1170.53,3771.05,131.80
1000000,false,11.07,36.70,1.28
1000000,true,13.21,43.15,1.44
10000000,false,94.08,312.17,11.42
10000000,true,126.83,411.92,14.10
100000000,false,870.20,2861.60,109.60
100000000,true,1138.72,3692.30,127.56
1000000,false,8.60,28.48,1.04
1000000,true,13.14,42.88,1.48
10000000,false,87.76,290.91,10.50
10000000,true,118.03,382.60,12.80
100000000,false,858.91,2822.96,106.71
100000000,true,1190.48,3857.58,133.79
1000000,false,8.91,29.59,1.00
1000000,true,12.91,42.01,1.55
10000000,false,89.62,296.94,11.00
10000000,true,116.50,378.21,12.77
100000000,false,870.43,2858.22,109.46
100000000,true,1126.05,3641.41,127.34
1000000,false,9.46,31.40,1.20
1000000,true,11.12,36.28,1.17
10000000,false,87.26,289.12,10.78
10000000,true,115.46,372.48,12.70
100000000,false,1044.48,3029.55,121.52
100000000,true,1393.75,4083.24,147.38
1000000,false,9.75,30.62,1.24
1000000,true,14.79,45.33,1.52
10000000,false,99.32,317.52,12.20
10000000,true,150.65,428.98,16.02
100000000,false,916.92,2979.20,115.72
100000000,true,1119.58,3619.34,126.22
1000000,false,8.85,29.42,1.04
1000000,true,12.47,40.42,1.40
10000000,false,94.12,312.18,11.27
10000000,true,121.16,393.87,13.56
100000000,false,884.21,2898.08,110.16
100000000,true,1131.85,3655.16,128.92
1000000,false,8.86,29.51,1.08
1000000,true,12.32,40.12,1.21
10000000,false,89.75,298.62,10.80
10000000,true,114.46,371.82,12.69
100000000,false,868.67,2842.56,109.55
100000000,true,1139.24,3680.05,127.93
How can I predict the processing time for, say, one billion records? I plan to use R to visualize the data.
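One possible starting point, sketched here in Python rather than R, is to assume the elapsed time follows a power law, time ≈ a · records^b, and fit a straight line to the logarithms of both quantities. The function names `fit_power_law` and `predict_time` are hypothetical, and the power-law model itself is an assumption, not something the measurements prove. Only the last two runs without extra processing are used below; a real analysis would include every row and fit the `true` rows separately.

```python
import math


def fit_power_law(records, times):
    """Least-squares fit of log(time) = log(a) + b * log(records).

    Returns (a, b) such that time is approximately a * records ** b.
    """
    xs = [math.log(n) for n in records]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_time(a, b, records):
    """Extrapolate the fitted model to a given record count."""
    return a * records ** b


# Elapsed times copied from the last two runs in the CSV above
# (extra processing = false). Using only six points is an
# illustrative shortcut, not a recommendation.
records = [1_000_000, 10_000_000, 100_000_000,
           1_000_000, 10_000_000, 100_000_000]
times = [8.85, 94.12, 884.21,
         8.86, 89.75, 868.67]

a, b = fit_power_law(records, times)
print(f"fitted exponent b = {b:.3f}")
print(f"predicted time for 1e9 records: "
      f"{predict_time(a, b, 1_000_000_000):.0f} s")
```

An exponent close to 1 would suggest roughly linear scaling over the measured range. The estimate for one billion records lies a full order of magnitude beyond the largest measurement, so it only holds if nothing changes at that size (memory limits, swapping, I/O saturation).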
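@ZheyuanLi: "On multi-processor machines, a multi-threaded process or a process forking children could have an elapsed time smaller than the total CPU time - as different threads or processes may run in parallel." http://stackoverflow.com/a/556411/3656424
@ZheyuanLi Oh, I didn't think that was important. But yes, I am in fact using [goroutines](https://golang.org/doc/effective_go.html#goroutines) in [Golang](https://golang.org/) for the data processing.
This question belongs on stats.stackexchange.com – user31264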