MongoDb/C# poor concurrent read performance

I'm testing a few NoSQL solutions, focusing mainly on read performance. Today is MongoDb's day. The test machine is a virtual machine with a quad-core Xeon @ 2.93 GHz and 8 GB of RAM.

I'm testing against a single database with a single collection of ~100,000 documents. Each BSON document is about 20 KB, give or take.

The managed object I'm working with is:

private class Job 
{ 
    // Serialized as the _id field by the C# driver's default conventions.
    public int Id { get; set; } 
    public string OrganizationName { get; set; } 
    public List<string> Categories { get; set; } 
    public List<string> Industries { get; set; } 
    public int Identifier { get; set; } 
    public string Description { get; set; } 
} 

The test procedure (a minimal sketch of the harness follows below):

- Create 100 threads.
- Start all threads.
- Each thread reads 20 random documents from the collection.
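
The harness itself isn't shown in the post; here is a minimal sketch of what it might look like, assuming the legacy 1.x C# driver and hypothetical connection details (the resetEvent and ThreadsCount fields are the ones referenced in TestSelectWithCursor below):

using System;
using System.Threading;
using MongoDB.Driver;

// Workers block on resetEvent so they all start querying at the same time;
// ThreadsCount counts finished workers. Both live in the test class alongside
// TestSelectWithCursor.
private static readonly ManualResetEvent resetEvent = new ManualResetEvent(false);
private static int ThreadsCount = 0;

private static void RunTest()
{
    // Hypothetical connection details; MongoServer.Create is the 1.x-era API.
    MongoServer server = MongoServer.Create("mongodb://localhost");
    MongoCollection jobs = server.GetDatabase("test").GetCollection("jobs");

    for (int i = 0; i != 100; ++i)
    {
        new Thread(TestSelectWithCursor).Start(jobs);
    }
    Console.WriteLine("100 threads created.");

    resetEvent.Set(); // release all 100 threads at once

    while (ThreadsCount < 100) // crude wait for every worker to report back
    {
        Thread.Sleep(100);
    }
}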

Here is the select method I'm using:

private static void TestSelectWithCursor(object state) 
{ 
    // Block until the harness releases all worker threads at once.
    resetEvent.WaitOne(); 

    MongoCollection jobs = (state as MongoCollection); 
    // AsQueryable<T>() and the In() extension come from MongoDB.Driver.Linq.
    var q = jobs.AsQueryable<Job>(); 
    // Note: the fixed seed means every thread requests the same 20 ids.
    Random r = new Random(938432094); 
    List<int> ids = new List<int>(); 
    for (int i = 0; i != 20; ++i) 
    { 
        ids.Add(r.Next(1000, 100000)); 
    } 
    Stopwatch sw = Stopwatch.StartNew(); 
    // Translates to a single find with {_id: {$in: [...]}}.
    var subset = from j in q 
                 where j.Id.In(ids) 
                 select j; 

    int count = 0; 
    // The query executes lazily, when the cursor is enumerated here.
    foreach (Job job in subset) 
    { 
        count++; 
    } 
    Console.WriteLine("Retrieved {0} documents in {1} ms.", count, sw.ElapsedMilliseconds); 
    ThreadsCount++; 
} 

The "count++" part just stands in for whatever I'd do after iterating the cursor, so please ignore it.

Anyway, the point is that these read times strike me as very slow. Here is a typical test run:

> 100 threads created. 
> 
> Retrieved 20 documents in 272 ms. Retrieved 20 documents in 522 ms. 
> Retrieved 20 documents in 681 ms. Retrieved 20 documents in 732 ms. 
> Retrieved 20 documents in 769 ms. Retrieved 20 documents in 843 ms. 
> Retrieved 20 documents in 1038 ms. Retrieved 20 documents in 1139 ms. 
> Retrieved 20 documents in 1163 ms. Retrieved 20 documents in 1170 ms. 
> Retrieved 20 documents in 1206 ms. Retrieved 20 documents in 1243 ms. 
> Retrieved 20 documents in 1322 ms. Retrieved 20 documents in 1378 ms. 
> Retrieved 20 documents in 1463 ms. Retrieved 20 documents in 1507 ms. 
> Retrieved 20 documents in 1530 ms. Retrieved 20 documents in 1557 ms. 
> Retrieved 20 documents in 1567 ms. Retrieved 20 documents in 1617 ms. 
> Retrieved 20 documents in 1626 ms. Retrieved 20 documents in 1659 ms. 
> Retrieved 20 documents in 1666 ms. Retrieved 20 documents in 1687 ms. 
> Retrieved 20 documents in 1711 ms. Retrieved 20 documents in 1731 ms. 
> Retrieved 20 documents in 1763 ms. Retrieved 20 documents in 1839 ms. 
> Retrieved 20 documents in 1854 ms. Retrieved 20 documents in 1887 ms. 
> Retrieved 20 documents in 1906 ms. Retrieved 20 documents in 1946 ms. 
> Retrieved 20 documents in 1962 ms. Retrieved 20 documents in 1967 ms. 
> Retrieved 20 documents in 1969 ms. Retrieved 20 documents in 1977 ms. 
> Retrieved 20 documents in 1996 ms. Retrieved 20 documents in 2005 ms. 
> Retrieved 20 documents in 2009 ms. Retrieved 20 documents in 2025 ms. 
> Retrieved 20 documents in 2035 ms. Retrieved 20 documents in 2066 ms. 
> Retrieved 20 documents in 2093 ms. Retrieved 20 documents in 2111 ms. 
> Retrieved 20 documents in 2133 ms. Retrieved 20 documents in 2147 ms. 
> Retrieved 20 documents in 2150 ms. Retrieved 20 documents in 2152 ms. 
> Retrieved 20 documents in 2155 ms. Retrieved 20 documents in 2160 ms. 
> Retrieved 20 documents in 2166 ms. Retrieved 20 documents in 2196 ms. 
> Retrieved 20 documents in 2202 ms. Retrieved 20 documents in 2254 ms. 
> Retrieved 20 documents in 2256 ms. Retrieved 20 documents in 2262 ms. 
> Retrieved 20 documents in 2263 ms. Retrieved 20 documents in 2285 ms. 
> Retrieved 20 documents in 2326 ms. Retrieved 20 documents in 2336 ms. 
> Retrieved 20 documents in 2337 ms. Retrieved 20 documents in 2350 ms. 
> Retrieved 20 documents in 2372 ms. Retrieved 20 documents in 2384 ms. 
> Retrieved 20 documents in 2412 ms. Retrieved 20 documents in 2426 ms. 
> Retrieved 20 documents in 2457 ms. Retrieved 20 documents in 2473 ms. 
> Retrieved 20 documents in 2521 ms. Retrieved 20 documents in 2528 ms. 
> Retrieved 20 documents in 2604 ms. Retrieved 20 documents in 2659 ms. 
> Retrieved 20 documents in 2670 ms. Retrieved 20 documents in 2687 ms. 
> Retrieved 20 documents in 2961 ms. Retrieved 20 documents in 3234 ms. 
> Retrieved 20 documents in 3434 ms. Retrieved 20 documents in 3440 ms. 
> Retrieved 20 documents in 3452 ms. Retrieved 20 documents in 3466 ms. 
> Retrieved 20 documents in 3502 ms. Retrieved 20 documents in 3524 ms. 
> Retrieved 20 documents in 3561 ms. Retrieved 20 documents in 3611 ms. 
> Retrieved 20 documents in 3652 ms. Retrieved 20 documents in 3655 ms. 
> Retrieved 20 documents in 3666 ms. Retrieved 20 documents in 3711 ms. 
> Retrieved 20 documents in 3742 ms. Retrieved 20 documents in 3821 ms. 
> Retrieved 20 documents in 3850 ms. Retrieved 20 documents in 4020 ms. 
> Retrieved 20 documents in 5143 ms. Retrieved 20 documents in 6607 ms. 
> Retrieved 20 documents in 6630 ms. Retrieved 20 documents in 6633 ms. 
> Retrieved 20 documents in 6637 ms. Retrieved 20 documents in 6639 ms. 
> Retrieved 20 documents in 6801 ms. Retrieved 20 documents in 9302 ms. 

The bottom line is that I was expecting faster read times than this, and I still think I'm doing something wrong. I'm not sure what other information I can provide at this point, but if I've missed something, please let me know.

I'm also including, in the hope that it helps, the explain() trace of one of the queries executed by the test:

{ 
    "cursor" : "BtreeCursor _id_ multi", 
    "nscanned" : 39, 
    "nscannedObjects" : 20, 
    "n" : 20, 
    "millis" : 0, 
    "nYields" : 0, 
    "nChunkSkips" : 0, 
    "isMultiKey" : false, 
    "indexOnly" : false, 
    "indexBounds" : { 
        "_id" : [ 
            [ 3276, 3276 ], 
            [ 8257, 8257 ], 
            [ 11189, 11189 ], 
            [ 21779, 21779 ], 
            [ 22293, 22293 ], 
            [ 23376, 23376 ], 
            [ 28656, 28656 ], 
            [ 29557, 29557 ], 
            [ 32160, 32160 ], 
            [ 34833, 34833 ], 
            [ 35922, 35922 ], 
            [ 39141, 39141 ], 
            [ 49094, 49094 ], 
            [ 54554, 54554 ], 
            [ 67684, 67684 ], 
            [ 76384, 76384 ], 
            [ 85612, 85612 ], 
            [ 85838, 85838 ], 
            [ 91634, 91634 ], 
            [ 99891, 99891 ] 
        ] 
    } 
}
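
(A side note: the same kind of trace can be pulled through the C# driver itself rather than the shell; a minimal sketch, assuming the 1.x Query builder and the jobs/ids variables from the test above:)

// Requires: using MongoDB.Bson; using MongoDB.Driver.Builders;
MongoCursor<Job> cursor = jobs.FindAs<Job>(Query.In("_id", new BsonArray(ids)));
BsonDocument plan = cursor.Explain(); // same document the shell's explain() prints
Console.WriteLine(plan.ToJson());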

If you have any ideas, I'd be most eager to read them. Thanks in advance!

Marcel

Well, no luck so far... Any fresh ideas? – 2012-04-09 17:13:01

Hi Marcel, I'm taking a look. To be thorough: you mentioned you're testing a few other stores as well. Have you run this same test against them, and did you see different results? – 2012-04-09 19:51:44

I'm trying to stick with Mongo. I tested RavenDb, but it was too slow. I also tested CouchDb, which looked even slower. In any case I'd like to use MongoDb, since the others only offer a REST interface, which (I think) isn't as fast as Mongo's binary protocol. – 2012-04-10 09:56:55

Answer

I suspect that the In() in the query's where clause is forcing a sequential scan, with a full fetch of every document, bypassing the efficiency of the _id index. Given that the random numbers can be spread quite widely, my guess is that each thread/query is essentially scanning the whole database.

I'd suggest trying a couple of things: (1) query each of the 20 documents individually, by its single id, and (2) consider using MongoCursor together with Explain() to get information about the query's index usage (a sketch follows below).
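
A minimal sketch of suggestion (1), assuming the legacy 1.x driver's FindOneAs and Query.EQ, with the jobs collection and ids list from the question; suggestion (2) is exactly the MongoCursor/Explain() snippet sketched under the trace above:

// Fetch each document individually by _id instead of one $in query.
// Requires: using MongoDB.Driver.Builders;
Stopwatch sw = Stopwatch.StartNew();
int count = 0;
foreach (int id in ids)
{
    Job job = jobs.FindOneAs<Job>(Query.EQ("_id", id));
    if (job != null)
    {
        count++;
    }
}
Console.WriteLine("Retrieved {0} documents in {1} ms.", count, sw.ElapsedMilliseconds);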

Regards,

Gary

P.S. The thread timings also seem to suggest some thread-scheduling effect is at work.

Thanks Gary! However, retrieving each document individually didn't improve the results. The explain() output for the In clause is shown above, and as you can see, a single thread doesn't cause the whole database to be scanned (nscanned is very low). But I think you're right that having lots of threads fetching random documents does end up scanning the database, not all at once, but in 100 small pieces. I'll also try the MongoDb forums. – 2012-04-06 06:31:53