將數值向量的整數向量保留爲整數

How to round floats to integers while preserving their sum?具有以僞代碼寫入的下面的answer，該向量將向量舍入爲整數值，以使元素總和保持不變並且舍入誤差最小化。我想在R.有效地實現這一點（即矢量如果可能）將數值向量的整數向量保留爲整數

例如，舍入這些號碼產生不同的總：

set.seed(1) 
(v <- 10 * runif(4)) 
# [1] 2.655087 3.721239 5.728534 9.082078 
(v <- c(v, 25 - sum(v))) 
# [1] 2.655087 3.721239 5.728534 9.082078 3.813063 
sum(v) 
# [1] 25 
sum(round(v)) 
# [1] 26

複製的僞代碼從answer參考

// Temp array with same length as fn. 
tempArr = Array(fn.length) 

// Calculate the expected sum. 
arraySum = sum(fn) 

lowerSum = 0 
-- Populate temp array. 
for i = 1 to fn.lengthf 
    tempArr[i] = { result: floor(fn[i]),    // Lower bound 
        difference: fn[i] - floor(fn[i]), // Roundoff error 
        index: i }       // Original index 

    // Calculate the lower sum 
    lowerSum = lowerSum + tempArr[i] + lowerBound 
end for 

// Sort the temp array on the roundoff error 
sort(tempArr, "difference") 

// Now arraySum - lowerSum gives us the difference between sums of these 
// arrays. tempArr is ordered in such a way that the numbers closest to the 
// next one are at the top. 
difference = arraySum - lowerSum 

// Add 1 to those most likely to round up to the next number so that 
// the difference is nullified. 
for i = (tempArr.length - difference + 1) to tempArr.length 
    tempArr.result = tempArr.result + 1 
end for 

// Optionally sort the array based on the original index. 
array(sort, "index")

來源

2015-09-12 Max Ghenis

在一個更簡單的形式，我會說這個算法是：

從一切向下舍入開始
收小數部分最高的數字，直到達到所需的總和。
1. 與floor
2. 訂單號輪下跌的小數部分（使用order）
3. 使用tail搶指數：
這可以在矢量化方式，通過實施中的R與k個最大小數部分，其中k是，我們需要增加的總和，達到我們的目標值
增量產值量元素在這些指數由1

在代碼：

smart.round <- function(x) { 
    y <- floor(x) 
    indices <- tail(order(x-y), round(sum(x)) - sum(y)) 
    y[indices] <- y[indices] + 1 
    y 
} 
v 
# [1] 2.655087 3.721239 5.728534 9.082078 3.813063 
sum(v) 
# [1] 25 
smart.round(v) 
# [1] 2 4 6 9 4 
sum(smart.round(v)) 
# [1] 25

來源

2015-09-13 00:00:04 josliber

感謝這個實用的功能！我想補充的答案，如果四捨五入至小數點後的指定數量時，功能可以修改：

smart.round <- function(x, digits = 0) { 
    up <- 10^digits 
    x <- x * up 
    y <- floor(x) 
    indices <- tail(order(x-y), round(sum(x)) - sum(y)) 
    y[indices] <- y[indices] + 1 
    y/up 
}

來源

2016-03-11 00:53:31 Mikhail

運行總計和差異爲基礎的方法相比smartRound更快的@josliber：

diffRound <- function(x) { 
    diff(c(0, round(cumsum(x)))) 
}

以下結果如何比較以100萬的記錄（在這裏看到的細節：Running Rounding）：

res <- microbenchmark(
    "diff(dww)" = x$diff.rounded <- diffRound(x$numbers) , 
    "smart(josliber)"= x$smart.rounded <- smartRound(x$numbers), 
    times = 100 
)

Unit: milliseconds 
expr   min  lq  mean  median  uq  max  neval 
diff(dww)  38.79636 59.70858 100.6581 95.4304 128.226 240.3088 100 
smart(josliber) 466.06067 719.22723 966.6007 1106.2781 1177.523 1439.9360 100

來源

2016-05-01 18:59:12 Bulat

儘管此解決方案比我的解決方案更快，並且保留了所需的總和，但不幸的是，它不會最小化問題中所要求的舍入誤差。例如，與'v < - C（2.655087，3.721239，5.728534，9.082078，3.813063）'，'的所述smart.round'舍入誤差，'總和（ABS（smart.round（ⅴ）-v））'，返回1.474329，而'diffRound'，'sum（abs（diffRound（v）-v））'的舍入誤差返回1.606633。 – josliber

這是事實，總體舍入誤差不會被diff方法最小化。我不確定這是否是該問題的強烈要求。 – Bulat

原始問題有這個條款：「考慮到滿足這四個條件，使舍入方差（sum（（in [i] - fn [i]）^ 2））最小化的算法是可取的，但這不是一個大問題「。 – Bulat

將數值向量的整數向量保留爲整數

回答

相關問題