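MongoDB: how to find 10 random documents in a collection of 100?

Is MongoDB able to fetch multiple random documents without making multiple queries?

For example, I implemented this on the JS side after loading all the documents in the collection, which is wasteful, so I wanted to check whether it can be done better with a db query.

The path I took on the JS side:

1. Get all the data
2. Make an array of the IDs
3. Shuffle the array (randomize the order)
4. Splice the array down to the number of documents required
5. Create the list of documents by selecting them, one by one from the whole collection, by the IDs left after the previous two operations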
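
The shuffle-and-trim part of the steps above can be sketched in plain JavaScript. This is only a minimal illustration: `shuffleCopy`, `pickRandomIds` and the stand-in id strings are hypothetical names, not part of the original code.

```javascript
// Fisher-Yates shuffle: returns a shuffled copy and leaves the input untouched.
function shuffleCopy(items) {
  var copy = items.slice();
  for (var i = copy.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1)); // random index in [0, i]
    var tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy;
}

// Hypothetical helper: pick `count` distinct ids at random.
function pickRandomIds(allIds, count) {
  return shuffleCopy(allIds).slice(0, count);
}

// Stand-in ids; in practice these would come from a projection on _id.
var allIds = [];
for (var k = 0; k < 100; k++) allIds.push("id" + k);

var chosen = pickRandomIds(allIds, 10);
console.log(chosen);
```

The ids are still loaded in full here, so the sketch only shows the selection logic, not a fix for the drawbacks described in the question.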

The two main drawbacks are that I either load all the data or make multiple queries.

Any suggestions are much appreciated.

2014-07-17 Iladarsda

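Is it really just 100 documents? If so, why optimize your current solution? – Prinzhorn

Well, this is just an example; I expect the collection to grow into the thousands. – Iladarsda

This was answered a long time ago, and MongoDB has evolved greatly since then.

As posted in another answer, MongoDB has supported sampling within the Aggregation Framework since version 3.2:

The way you could do this is: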

db.products.aggregate([{$sample: {size: 5}}]); // You want to get 5 docs

Or:

db.products.aggregate([ 
    {$match: {category:"Electronic Devices"}}, // filter the results 
    {$sample: {size: 5}} // You want to get 5 docs 
]);

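However, there are some warnings about the $sample operator:

(as of 2017, when the latest version was 3.4) All of the following must be met:

- $sample is the first stage of the pipeline
- N is less than 5% of the total documents in the collection
- The collection contains more than 100 documents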

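If any of the above conditions is not met, $sample performs a collection scan followed by a random sort to select N documents.

This is the case in the last example above, where $match precedes $sample.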
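
As a rough illustration, the three conditions can be written as a check on a pipeline. `sampleAvoidsCollectionScan` is a hypothetical helper, not a MongoDB API, and it only encodes the conditions as quoted above for version 3.4; other versions may behave differently.

```javascript
// Hypothetical helper: true when all three quoted conditions hold,
// i.e. when $sample would not need the collection scan plus random sort.
function sampleAvoidsCollectionScan(pipeline, totalDocs) {
  var first = pipeline[0];
  if (!first || !first.$sample) return false; // $sample must be the first stage
  var n = first.$sample.size;
  return n < 0.05 * totalDocs && totalDocs > 100;
}

// 5 documents out of 1000, $sample first: all conditions hold.
var direct = sampleAvoidsCollectionScan([{ $sample: { size: 5 } }], 1000);

// Same sample behind a $match stage: the first condition fails.
var filtered = sampleAvoidsCollectionScan(
  [{ $match: { category: "Electronic Devices" } }, { $sample: { size: 5 } }],
  1000
);

// The case from the question: 10 documents out of 100.
var fromQuestion = sampleAvoidsCollectionScan([{ $sample: { size: 10 } }], 100);

console.log(direct, filtered, fromQuestion);
```

For the numbers in the question (10 out of 100) the check fails on both the 5% rule and the collection-size rule, so a scan and sort would be expected there, which is cheap at that size anyway.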

OLD ANSWER

You could always run:

db.products.find({category:"Electronic Devices"}).skip(Math.floor(Math.random()*YOUR_COLLECTION_SIZE))

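But the order won't be random, and you will need two queries (one count to get YOUR_COLLECTION_SIZE) or an estimate of how big the collection is (about 100 records, about 1000, about 10000...).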
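
A minimal sketch of the offset calculation for this skip-based approach follows. `randomSkipOffset` is a hypothetical helper; the shell calls in the trailing comment are shown for context only and are not executed here.

```javascript
// Hypothetical helper: integer offset in [0, total - wanted], so that
// skip(offset).limit(wanted) never runs past the end of the result set.
function randomSkipOffset(total, wanted) {
  var maxOffset = Math.max(total - wanted, 0);
  return Math.floor(Math.random() * (maxOffset + 1));
}

var offset = randomSkipOffset(100, 10);
console.log(offset);

// In the mongo shell this might be used as (two queries, as noted above):
//   var filter = {category: "Electronic Devices"};
//   var total = db.products.count(filter);
//   db.products.find(filter).skip(randomSkipOffset(total, 10)).limit(10);
```

This returns a contiguous run of documents starting at a random position, so the documents within one result are not independent of each other.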

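You could also add a field holding a random number to every document and query by that number. The drawback is that you will get the same results every time you run the same query. To fix that you can always play with limit and skip, or even with sort. You could also update those random numbers every time you fetch a record (which implies more queries).

I don't know whether you are using Mongoose, Mongoid or the Mongo driver for a specific language directly, so I'll write everything for the mongo shell.

So, let's say your product record looks like this: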

{ 
_id: ObjectId("..."), 
name: "Awesome Product", 
category: "Electronic Devices", 
}

I would suggest using:

{ 
_id: ObjectId("..."), 
name: "Awesome Product", 
category: "Electronic Devices", 
_random_sample: Math.random() 
}

Then you could do:

db.products.find({category:"Electronic Devices",_random_sample:{$gte:Math.random()}})
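
One thing to watch with this query is that it returns nothing when the random threshold happens to be larger than every stored `_random_sample`. A possible sketch, under the assumption that a second query in the opposite direction is acceptable, builds both filters up front. `randomSampleFilters` is a hypothetical helper, not something from the answer.

```javascript
// Hypothetical helper: primary filter plus a fallback for the case where
// no document has _random_sample >= threshold.
function randomSampleFilters(baseFilter, threshold) {
  return {
    primary: Object.assign({}, baseFilter, { _random_sample: { $gte: threshold } }),
    fallback: Object.assign({}, baseFilter, { _random_sample: { $lt: threshold } })
  };
}

var base = { category: "Electronic Devices" };
var filters = randomSampleFilters(base, Math.random());
console.log(JSON.stringify(filters));

// In the mongo shell this might be used as (not run here):
//   var docs = db.products.find(filters.primary).limit(10).toArray();
//   if (docs.length === 0) docs = db.products.find(filters.fallback).limit(10).toArray();
```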

Then you could run this periodically to refresh the documents' _random_sample field:

var your_query = {}; // matching everything would hurt performance if there are a lot of records
your_query = {category: "Electronic Devices"}; // update only this category
// upsert = false, multi = true
db.products.update(your_query,{$set:{_random_sample:Math.random()}},false,true)

Or, whenever you retrieve some records, you could update all of them or just a few (depending on how many records you've retrieved):

for(var i = 0; i < records.length; i++){ 
    var query = {_id: records[i]._id}; 
    //upsert = false, multi = false 
    db.products.update(query,{$set:{_random_sample:Math.random()}},false,false); 
}

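EDIT

Be aware that

db.products.update(your_query,{$set:{_random_sample:Math.random()}},false,true)

won't work very well, because Math.random() is evaluated once and every product matching your query gets the same random number. The last approach works better (updating some documents as you retrieve them).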
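
To give each document its own number without one round trip per document, one option is to build a list of per-document operations and send them in a single batch. The sketch below only builds that list; `buildRandomSampleOps` is a hypothetical helper, and the `bulkWrite` call in the comment assumes a 3.2+ shell and is not executed here.

```javascript
// Hypothetical helper: one updateOne operation per id, each carrying
// a freshly generated random value.
function buildRandomSampleOps(ids) {
  return ids.map(function (id) {
    return {
      updateOne: {
        filter: { _id: id },
        update: { $set: { _random_sample: Math.random() } }
      }
    };
  });
}

var ops = buildRandomSampleOps(["id1", "id2", "id3"]); // stand-in ids
console.log(JSON.stringify(ops, null, 2));

// In the mongo shell (3.2+) this might be used as:
//   var ids = db.products.find(your_query, {_id: 1}).map(function (d) { return d._id; });
//   db.products.bulkWrite(buildRandomSampleOps(ids));
```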

2014-07-17 16:12:23

This is what I came up with in the end:

var async = require('async');
var numberOfItems = 10;

// GET LIST OF ALL ID's
SchemaNameHere.find({}, { '_id': 1 }, function(err, data) {

    // stop here on error instead of falling through
    if (err) return res.send(err);

    // shuffle array, as per here https://github.com/coolaj86/knuth-shuffle
    var arr = shuffle(data.slice(0));

    // keep only the first numberOfItems of the shuffled array
    arr.splice(numberOfItems, arr.length - numberOfItems);

    // new array to store all items
    var return_arr = [];

    // use async each, as per here http://justinklemm.com/node-js-async-tutorial/
    async.each(arr, function(item, callback) {

        // get items 1 by 1 and add to the return_arr
        SchemaNameHere.findById(item._id, function(err, doc) {

            // hand the error to async.each, which then calls the final callback
            if (err) return callback(err);
            return_arr.push(doc);

            // go to the next item, or to the final callback if done
            callback();

        });

    }, function(err) {

        // runs once all items in arr are processed, or on the first error
        if (err) return res.send(err);
        res.json(return_arr);

    });

});
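
The one-by-one lookups could probably be collapsed into a single query with `$in`. The sketch below only builds the filter; `buildIdFilter` and the stand-in `picked` array are hypothetical, and the Mongoose call in the comment is not executed here.

```javascript
// Hypothetical helper: turn [{_id: ...}, ...] into a single $in filter.
function buildIdFilter(idDocs) {
  return { _id: { $in: idDocs.map(function (doc) { return doc._id; }) } };
}

// Stand-in for the shuffled, trimmed array of id documents.
var picked = [{ _id: "a1" }, { _id: "b2" }, { _id: "c3" }];
var idFilter = buildIdFilter(picked);
console.log(JSON.stringify(idFilter));

// With Mongoose this might replace the async.each loop:
//   SchemaNameHere.find(buildIdFilter(arr), function (err, docs) { ... });
```

A `$in` query does not return documents in the order of the id list, so the result would need reshuffling if the order matters.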

2014-08-16 13:03:05 Iladarsda

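Since 3.2 there is an easier way to get a random sample of documents from a collection:

$sample: new in version 3.2.

Randomly selects the specified number of documents from its input.

The $sample stage has the following syntax: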

{ $sample: { size: <positive integer> } }

Source: MongoDB Docs

In this case:

db.products.aggregate([{$sample: {size: 10}}]);

2016-04-06 09:53:34

This should be the accepted answer – nxmohamad

Skip didn't work out for me. Here is what I ended up with:

var randomDoc = db.getCollection("collectionName").aggregate([ {
    $match : {
        // criteria to filter matches
    }
}, {
    $sample : {
        size : 1
    }
} ]).toArray()[0]; // aggregate() returns a cursor in shells that support $sample

This gets one random result matching the criteria.

2016-09-30 18:22:33 Marc
