2015-11-23 26 views
10

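What are the groups of four dashes in the .NET reference source code?

I was browsing the source of PluralizationService when I noticed something odd. The class contains several private dictionaries that reflect different pluralization rules. For example: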

private string[] _uninflectiveWordList = 
     new string[] { 
      "bison", "flounder", "pliers", "bream", "gallows", "proceedings", 
      "breeches", "graffiti", "rabies", "britches", "headquarters", "salmon", 
      "carp", "----", "scissors", "ch----is", "high-jinks", "sea-bass", 
      "clippers", "homework", "series", "cod", "innings", "shears", "contretemps", 
      "jackanapes", "species", "corps", "mackerel", "swine", "debris", "measles", 
      "trout", "diabetes", "mews", "tuna", "djinn", "mumps", "whiting", "eland", 
      "news", "wildebeest", "elk", "pincers", "police", "hair", "ice", "chaos", 
      "milk", "cotton", "pneumonoultramicroscopicsilicovolcanoconiosis", 
      "information", "aircraft", "scabies", "traffic", "corn", "millet", "rice", 
      "hay", "----", "tobacco", "cabbage", "okra", "broccoli", "asparagus", 
      "lettuce", "beef", "pork", "venison", "mutton", "cattle", "offspring", 
      "molasses", "shambles", "shingles"}; 

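What are the groups of four dashes in the strings? I don't see them handled anywhere in the code, so they are not some kind of template. The only thing I can think of is that they are censored expletives ('ch----is' would be 'chassis'), in which case the censoring actually hurts readability. Has anyone else come across this? If I am interested in the actual, complete list, how would I view it?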

+0

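I don't know for certain, but my guess is that it is some kind of placeholder acting as a wildcard (for example, matching a pattern made up of "ch", then four characters, then "is").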

+4

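*"pneumonoultramicroscopicsilicovolcanoconiosis"* I'd guess the tester who found that one got a good laugh out of the bug report, and the developer who fixed it laughed too... (according to Wikipedia, it is the longest word in English)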

+0

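My best guess is a pattern match where the letters themselves don't matter but the length does: for example, cat, hat and bat might all be lumped together under the dash pattern if they don't match any other case. Just a guess.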

Answers

5

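Using Reflector to look at the decompiled code, I can verify that the compiled version does not have "----" in it, so it does seem to be censorship applied somewhere along the line. The decompiled code has this in the constructor: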

this._uninflectiveWordList = new string[] { 
    "bison", "flounder", "pliers", "bream", "gallows", "proceedings", "breeches", "graffiti", "rabies", "britches", "headquarters", "salmon", "carp", "herpes", "scissors", "chassis", 
    "high-jinks", "sea-bass", "clippers", "homework", "series", "cod", "innings", "shears", "contretemps", "jackanapes", "species", "corps", "mackerel", "swine", "debris", "measles", 
    "trout", "diabetes", "mews", "tuna", "djinn", "mumps", "whiting", "eland", "news", "wildebeest", "elk", "pincers", "police", "hair", "ice", "chaos", 
    "milk", "cotton", "pneumonoultramicroscopicsilicovolcanoconiosis", "information", "aircraft", "scabies", "traffic", "corn", "millet", "rice", "hay", "hemp", "tobacco", "cabbage", "okra", "broccoli", 
    "asparagus", "lettuce", "beef", "pork", "venison", "mutton", "cattle", "offspring", "molasses", "shambles", "shingles" 
}; 

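As you can see, the censored words are "herpes", "chassis" and "hemp" (if I have followed along correctly). I personally don't think any of these need censoring, which suggests an automated system is responsible. I would assume the original source code contains the full words, rather than them being added in some kind of pre-compile merge (if nothing else, because "----" is not enough to tell what should be substituted). I imagine that, for whatever reason, the reference source website censors them.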
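The comparison can also be done mechanically rather than by eye. Below is a minimal sketch in Python that uses short excerpts copied from the two listings above. The helper name `censored_pairs` is made up for illustration, and the sketch assumes both lists keep their entries in the same order, which holds for the listings shown here.

```python
# Excerpts from the two listings above, in the same order.
# reference_source: as published on the reference source site.
# decompiled: as recovered from the compiled assembly with Reflector.
reference_source = ["carp", "----", "scissors", "ch----is", "high-jinks",
                    "hay", "----", "tobacco"]
decompiled = ["carp", "herpes", "scissors", "chassis", "high-jinks",
              "hay", "hemp", "tobacco"]


def censored_pairs(published, actual):
    """Return (published, actual) pairs where the published entry differs.

    Hypothetical helper: assumes both lists have the same length and order.
    """
    return [(p, a) for p, a in zip(published, actual) if p != a]


pairs = censored_pairs(reference_source, decompiled)
for published, actual in pairs:
    print(f"{published!r} -> {actual!r}")
```

Running the same comparison over the full lists should flag only the three entries named above.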

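Hans Passant also addressed this in a comment on a very similar question: What does ----s mean in the context of StringBuilder.ToString()?. It explains that "the source code of the published reference source is pushed through a filter that removes objectionable content from the source".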

+0

"ass", not "chassis"; it might make someone blush

+3

You're right, "ass" is what was removed; I was referring to what the complete words are – Chris

+4

So this is a case of clbuttic filtering gone wrong?
