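Here's my solution. My goal was not to provide the simplest solution, but one that can take a variety of (sometimes weird) name formats and generate a best guess at the first-name and last-name initials (or, for mononymous people, a single initial).
I also tried to write it in a relatively international-friendly way, using Unicode regexes. I have no experience generating initials for many kinds of foreign names (e.g. Chinese), but it should at least produce something usable to represent the person, in no more than two characters. For example, feeding it a Korean name like "행운의 복숭아" yields 행복, as you might expect (although that may not be the right way to do it in Korean culture).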
/// <summary>
/// Given a person's first and last name, we'll make our best guess to extract up to two initials, hopefully
/// representing their first and last name, skipping any middle initials, Jr/Sr/III suffixes, etc. The letters
/// will be returned together in ALL CAPS, e.g. "TW".
///
/// The way it parses names for many common styles:
///
/// Mason Zhwiti -> MZ
/// mason lowercase zhwiti -> MZ
/// Mason G Zhwiti -> MZ
/// Mason G. Zhwiti -> MZ
/// John Queue Public -> JP
/// John Q. Public, Jr. -> JP
/// John Q Public Jr. -> JP
/// Thurston Howell III -> TH
/// Thurston Howell, III -> TH
/// Malcolm X -> MX
/// A Ron -> AR
/// A A Ron -> AR
/// Madonna -> M
/// Chris O'Donnell -> CO
/// Malcolm McDowell -> MM
/// Robert "Rocky" Balboa, Sr. -> RB
/// 1Bobby 2Tables -> BT
/// Éric Ígor -> ÉÍ
/// 행운의 복숭아 -> 행복
///
/// </summary>
/// <param name="name">The full name of a person.</param>
/// <returns>One to two uppercase initials, without punctuation.</returns>
public static string ExtractInitialsFromName(string name)
{
    // Requires: using System.Text.RegularExpressions;
    // First remove all punctuation, symbols, control/format chars, and numbers (Unicode categories P, S, C, N).
    string initials = Regex.Replace(name, @"[\p{P}\p{S}\p{C}\p{N}]+", "");
    // Replace every run of whitespace/separator characters (Unicode category Z) with a single, regular ASCII space.
    initials = Regex.Replace(initials, @"\p{Z}+", " ");
    // Remove a trailing Sr, Jr, I, II, III, IV, V, VI, VII, VIII, or IX.
    initials = Regex.Replace(initials.Trim(), @"\s+(?:[JS]R|I{1,3}|I[VX]|VI{0,3})$", "", RegexOptions.IgnoreCase);
    // Extract up to 2 initials from the remaining cleaned name.
    initials = Regex.Replace(initials, @"^(\p{L})[^\s]*(?:\s+(?:\p{L}+\s+(?=\p{L}))?(?:(\p{L})\p{L}*)?)?$", "$1$2").Trim();
    if (initials.Length > 2)
    {
        // Worst case scenario, everything failed, just grab the first two letters of what we have left.
        initials = initials.Substring(0, 2);
    }
    return initials.ToUpperInvariant();
}
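To make the four passes easier to follow, here is a small sketch that runs the same regexes one at a time and prints what is left after each step. The `InitialsTrace` class and its `Trace` helper are hypothetical names made up for this illustration; only the patterns are taken from the method above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InitialsTrace
{
    // Hypothetical helper (not part of the answer's method): returns the string
    // left over after each of the four regex passes, in order.
    public static string[] Trace(string name)
    {
        // Pass 1: drop punctuation, symbols, control/format chars, and numbers.
        string s1 = Regex.Replace(name, @"[\p{P}\p{S}\p{C}\p{N}]+", "");
        // Pass 2: collapse each run of Unicode separators into one ASCII space.
        string s2 = Regex.Replace(s1, @"\p{Z}+", " ");
        // Pass 3: strip a trailing Jr/Sr or Roman numeral I through IX.
        string s3 = Regex.Replace(s2.Trim(), @"\s+(?:[JS]R|I{1,3}|I[VX]|VI{0,3})$", "", RegexOptions.IgnoreCase);
        // Pass 4: keep the first letter of the first word and of the last word.
        string s4 = Regex.Replace(s3, @"^(\p{L})[^\s]*(?:\s+(?:\p{L}+\s+(?=\p{L}))?(?:(\p{L})\p{L}*)?)?$", "$1$2").Trim();
        return new[] { s1, s2, s3, s4 };
    }

    public static void Main()
    {
        foreach (string stage in Trace("John Q. Public, Jr."))
        {
            Console.WriteLine("[" + stage + "]");
        }
        // Prints:
        // [John Q Public Jr]
        // [John Q Public Jr]
        // [John Q Public]
        // [JP]
    }
}
```

Pass 2 leaves this input unchanged because it already uses plain single spaces; it only matters for names containing non-breaking spaces or other Unicode separators.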
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
Please provide the expected output.
Looking at this, it does not seem like a problem that regular expressions can solve.. – Anirudha