Reimplementing ToUpper()

If ToUpper() didn't exist, how would you write it? Bonus points for internationalization and localization.

Curiosity sparked by this: http://thedailywtf.com/Articles/The-Long-Way-toUpper.aspx

2008-12-02 Colin Pickard

I download the Unicode tables.
I import the tables into a database.
I write a method upper().

Here is a sample implementation ;)

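import java.sql.Connection; 
import java.sql.PreparedStatement; 
import java.sql.ResultSet; 
import java.sql.SQLException; 

// DBFactory and DBUtil are helper classes that are not shown here. 
public static String upper(String s) throws SQLException { 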
    if (s == null) { 
     return null; 
    } 

    final int N = s.length(); // Mind the optimization! 
    Connection conn = null; 
    PreparedStatement stmtName = null; 
    PreparedStatement stmtSmall = null; 
    ResultSet rsName = null; 
    ResultSet rsSmall = null; 
    StringBuilder buffer = new StringBuilder (N); // Much faster than StringBuffer! 
    try { 
     conn = DBFactory.getConnection(); 
     stmtName = conn.prepareStatement("select name from unicode.chart where codepoint = ?"); 
     // TODO Optimization: Maybe move this in the if() so we don't create this 
     // unless there are uppercase characters in the string. 
     stmtSmall = conn.prepareStatement("select codepoint from unicode.chart where name = ?"); 
     for (int i=0; i<N; i++) { 
      int c = s.charAt(i); 
      stmtName.setInt(1, c); 
      rsName = stmtName.executeQuery(); // execute() returns a boolean, not a ResultSet 
      if (rsName.next()) { 
       String name = rsName.getString(1); 
       if (name.contains(" SMALL ")) { 
        name = name.replaceAll(" SMALL ", " CAPITAL "); 

        stmtSmall.setString(1, name); 
        rsSmall = stmtSmall.executeQuery(); 
        if (rsSmall.next()) { 
         c = rsSmall.getInt(1); 
        } 

        rsSmall = DBUtil.close(rsSmall); 
       } 
      } 
      rsName = DBUtil.close(rsName); 
      buffer.appendCodePoint(c); // Append the (possibly converted) character 
     } 
    } 
    finally { 
     // Always clean up 
     rsSmall = DBUtil.close(rsSmall); 
     rsName = DBUtil.close(rsName); 
     stmtSmall = DBUtil.close(stmtSmall); 
     stmtName = DBUtil.close(stmtName); 
     conn = DBUtil.close(conn); // Assumes DBUtil.close() also has a Connection overload 
    } 

    // TODO Optimization: Maybe read the table once into RAM at the start 
    // Would waste a lot of memory, though :/ 
    return buffer.toString(); 
}

;)

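Note: The Unicode charts, which you can find on unicode.org, contain the name of each character/code point. For lowercase characters this name contains " SMALL " (mind the spaces, or it might match "SMALLER" and the like). You can then search for the same name with " SMALL " replaced by " CAPITAL ". If you find it, you have found the capital version.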
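The name-based lookup described in the note can be tried without a database. The following is only a rough sketch in Python, assuming the standard `unicodedata` module stands in for the imported table; the function name `upper_by_name` is made up for illustration.

```python
import unicodedata


def upper_by_name(text):
    """Uppercase text by swapping " SMALL " for " CAPITAL " in character names."""
    out = []
    for ch in text:
        # Unnamed characters (e.g. control characters) yield the default "".
        name = unicodedata.name(ch, "")
        if " SMALL " in name:
            try:
                # lookup() raises KeyError if no character has that name.
                ch = unicodedata.lookup(name.replace(" SMALL ", " CAPITAL "))
            except KeyError:
                pass  # No capital counterpart by name; keep the character.
        out.append(ch)
    return "".join(out)
```

Characters whose capital name does not exist, such as the Greek final sigma, are left unchanged. The trick only finds one-to-one counterparts, so it cannot produce multi-character results.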


2008-12-02 14:41:54

I don't think a single post can hold the size of the Unicode tables :)

Unfortunately, it is not as easy as just calling char.ToUpper() on every character.

Examples:

(string-upcase "Straße") ⇒ "STRASSE" 
(string-downcase "Straße") ⇒ "straße" 
(string-upcase "ΧΑΟΣ")  ⇒ "ΧΑΟΣ" 
(string-downcase "ΧΑΟΣ") ⇒ "χαος" 
(string-downcase "ΧΑΟΣΣ") ⇒ "χαοσς" 
(string-downcase "ΧΑΟΣ Σ") ⇒ "χαος σ" 
(string-upcase "χαος")  ⇒ "ΧΑΟΣ" 
(string-upcase "χαοσ")  ⇒ "ΧΑΟΣ"
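Two things in these examples break a per-character approach: one character can become two (ß to SS), and the lowercase form of Σ depends on its position in the word. Here is a minimal Python sketch of both effects. The tables are deliberately tiny and hypothetical, and the final-sigma rule is a simplification (it only looks at the immediate neighbours), not the full Unicode rule.

```python
# Hypothetical, deliberately tiny tables for illustration only.
SIMPLE_UPPER = {"s": "S", "t": "T", "r": "R", "a": "A", "e": "E"}
SPECIAL_UPPER = {"\u00df": "SS"}  # sharp s: one character becomes two

CAPITAL_SIGMA = "\u03a3"
SMALL_SIGMA = "\u03c3"
FINAL_SIGMA = "\u03c2"


def upper_full(text):
    """Uppercase using a special table that may change the string length."""
    out = []
    for ch in text:
        if ch in SPECIAL_UPPER:
            out.append(SPECIAL_UPPER[ch])
        else:
            out.append(SIMPLE_UPPER.get(ch, ch))
    return "".join(out)


def lower_greek(text):
    """Lowercase Greek capitals, choosing the sigma form by context."""
    out = []
    for i, ch in enumerate(text):
        if ch == CAPITAL_SIGMA:
            prev_is_letter = i > 0 and text[i - 1].isalpha()
            next_is_letter = i + 1 < len(text) and text[i + 1].isalpha()
            # Simplified rule: final form only at the end of a word.
            is_final = prev_is_letter and not next_is_letter
            out.append(FINAL_SIGMA if is_final else SMALL_SIGMA)
        elif "\u0391" <= ch <= "\u03a9":
            # Basic Greek capitals sit 0x20 below their lowercase forms.
            out.append(chr(ord(ch) + 0x20))
        else:
            out.append(ch)
    return "".join(out)
```

With these definitions, `upper_full` returns a string one character longer than its input for "Straße", and `lower_greek` picks a different sigma for the last letter of a word than for a sigma standing alone.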


2008-12-02 13:49:51 leppie

(string-upcase "Straße") ⇒ "STRAẞE" – hangy 2008-12-02 15:17:47

Hangy, sorry, that doesn't render for me. Also, my conversions are locale-independent (I guess I should have mentioned that ;p). – leppie 2008-12-02 16:04:46

I just pasted these from the R6RS Scheme spec; it might be a typo, so I'll check and test. – leppie 2008-12-02 16:05:51

I won't win the bonus points, but here it is for 7-bit ASCII:

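char toupper(char c) 
{ 
    /* Anything outside 'a'..'z' is returned unchanged. */ 
    if ((c < 'a') || (c > 'z')) { return c; } 
    /* Clearing bit 5 (0x20) maps 'a'..'z' onto 'A'..'Z' in ASCII. */ 
    else { return c & 0xdf; } 
}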
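For comparison, the same bit trick can be sketched at string level in Python. This is only an illustration of the ASCII-only approach; the name `ascii_upper` is made up, and anything outside a-z passes through untouched.

```python
def ascii_upper(text):
    """Uppercase only the ASCII letters a-z; leave everything else alone."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            # In ASCII, lower and upper case differ only in bit 5 (0x20).
            out.append(chr(ord(ch) & 0xDF))
        else:
            out.append(ch)
    return "".join(out)
```

Non-ASCII letters such as ß are simply ignored, which is exactly why this version earns no bonus points.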


2008-12-02 13:50:06

In Python:

toupper_map = {"a": "A"}  # ...plus a massive dictionary to handle all cases in all languages 
def to_upper(c): 
    return toupper_map.get(c, c)
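One plausible way to fill such a dictionary is to parse the Unicode Character Database. The sketch below assumes the semicolon-separated layout of UnicodeData.txt, where field 0 is the code point and field 12 is the simple uppercase mapping. The three sample lines are illustrative stand-ins for the real file, and `build_upper_map` is a hypothetical helper name.

```python
# Illustrative sample lines in the UnicodeData.txt layout (not the full file).
SAMPLE_DATA = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
00DF;LATIN SMALL LETTER SHARP S;Ll;0;L;;;;;N;;;;;
"""


def build_upper_map(data):
    """Build {char: uppercase char} from UnicodeData.txt-style lines."""
    mapping = {}
    for line in data.splitlines():
        fields = line.split(";")
        if len(fields) < 15:
            continue  # Skip blank or malformed lines.
        upper_hex = fields[12]  # Simple uppercase mapping, may be empty.
        if upper_hex:
            mapping[chr(int(fields[0], 16))] = chr(int(upper_hex, 16))
    return mapping


sample_map = build_upper_map(SAMPLE_DATA)


def to_upper_char(c):
    """Look up one character, falling back to the character itself."""
    return sample_map.get(c, c)
```

Characters with an empty uppercase field, like ß in the sample, get no entry and fall through unchanged. Multi-character mappings live in a separate file and are out of scope for this sketch.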

Or, if you want to do it the "wrong way":

def to_upper(c): 
    for k, v in toupper_map.items(): 
        if k == c: return v 
    return c


2008-12-02 14:02:00 hasen

Let me offer even more bonus points for languages such as Hebrew, Arabic, Georgian and others that have no capital (uppercase) letters. :-)


2008-12-02 14:16:21

No static table is sufficient, because you need to know the language before you can know the correct conversion.

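For example, in Turkish i needs to go to İ (U+0130), whereas in any other language it needs to go to I (U+0049). And i is the same character, U+0069, in both cases.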
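A language-aware conversion therefore needs the language as an input. Here is a minimal Python sketch, assuming a hypothetical `lang` parameter with "tr" for Turkish and falling back to Python's built-in `str.upper()` for everything else; a real implementation would carry many more per-language rules.

```python
# Per-language overrides; only the Turkish dotted i is modelled here.
LANGUAGE_OVERRIDES = {
    "tr": {"i": "\u0130"},  # i -> LATIN CAPITAL LETTER I WITH DOT ABOVE
}


def to_upper(text, lang="en"):
    """Uppercase text, applying language-specific overrides first."""
    overrides = LANGUAGE_OVERRIDES.get(lang, {})
    return "".join(overrides.get(ch, ch.upper()) for ch in text)
```

Calling `to_upper("istanbul", "tr")` and `to_upper("istanbul")` gives two different results for the same input string, which is the point of the answer above.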


2008-12-02 14:25:35
