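Accented/umlauted characters in C?

I'm just learning C and got an assignment where we have to translate plain text into Morse code and back. (I'm mostly familiar with Java, so bear with me on the terms I use.)

To do this, I have an array with the strings for all letters.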

char *letters[] = {
    ".- ", "-... ", "-.-. ", "-.. ", ".", "..-." /* etc. */
};

I wrote a function that returns the position of the desired letter.

int letter_nr(unsigned char c) 
{ 
    return c-97; 
}

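This works fine, but the assignment specification requires handling of the Swedish umlauted letters åäö. The Swedish alphabet is the same as the English one, with these three letters at the end. I tried checking for them like so: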

int letter_nr(unsigned char c) 
{ 
    if (c == 'å') 
     return 26; 
    if (c == 'ä') 
     return 27; 
    if (c == 'ö') 
     return 28; 
    return c-97; 
}

Unfortunately, when I tested this function, I got the same value for all three of them: 98. Here is my main function, used for testing:

int main() 
{ 
    unsigned char letter; 

    while(1) 
    { 
     printf("Type a letter to get its position: "); 
     scanf("%c", &letter); 
     printf("%d\n", letter_nr(letter)); 
    } 
    return 0; 
}

What can I do to solve this?

Source

2009-11-12 pg-robban

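Which compiler and OS? – 2009-11-12 20:41:56

XCode (Mac OS X). – 2009-11-12 21:01:05

I'm on OS X too, and I have the same problem with my Polish letters :) – 2009-11-12 21:09:27

In general, encoding is a rather complicated topic. On the other hand, if you just want a dirty solution specific to your compiler/platform, then add something like this to your code: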

printf("letter 0x%x is number %d\n", letter, letter_nr(letter));

It will give you the hex value of each umlauted letter. Then just replace the letters in your if statements with those numbers.

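EDIT: You say that you always get 98, so your scanf read 98 + 97 = 195 = 0xC3 from the console. According to this table, 0xC3 is the start of the UTF8 sequence for a common LATIN SMALL LETTER N WITH something in the Latin1 block. Are you on Mac OS X?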

EDIT: This is my final call. It's quite a hack, but it works for me :)

#include <stdio.h> 

// Scan for a letter. Return its position in the Morse table.
// Recognises UTF8 for Swedish letters.
int letter_nr(void)
{
    unsigned char letter;
    // scan for the first byte; give up if nothing could be read
    if (scanf("%c", &letter) != 1)
        return -1;
    if (0xC3 == letter)
    {
        // scan again: this is UTF8, so the second byte of a two-byte character follows
        if (scanf("%c", &letter) != 1)
            return -1;
        // LATIN SMALL LETTER A WITH RING ABOVE = å
        if (0xA5 == letter)
            return 26;
        // LATIN SMALL LETTER A WITH DIAERESIS = ä
        if (0xA4 == letter)
            return 27;
        // LATIN SMALL LETTER O WITH DIAERESIS = ö
        if (0xB6 == letter)
            return 28;

        printf("Unknown letter. 0x%x. ", letter);
        return -1;
    }
    // it seems to be regular ASCII
    return letter - 97;
} // letter_nr

int main()
{
    while(1)
    {
        printf("Type a letter to get its position: ");

        int val = letter_nr();
        if(-1 != val)
            printf("Morse code is %d.\n", val);
        else
            printf("Unknown Morse code.\n");

        // strip remaining new line
        unsigned char new_line;
        scanf("%c", &new_line);
    }
    return 0;
}

Source

2009-11-12 20:34:39

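Unfortunately, this seems to give me the same problem as before: I get the same hex value for all three letters. – 2009-11-12 20:39:15

Can you explain where you get letter from? Should I make it a global variable and pass what was read to the letter_nr function? – 2009-11-12 21:21:33

This post shows a profound ignorance of UTF-8, and of encodings in general. This is just wrong: the sum of two bytes is not the Unicode code point. -1 – gnud 2009-11-12 21:35:07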

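The encoding of character constants actually depends on your locale settings.

The safest bet is to use wide characters and the corresponding functions. You declare the alphabet as const wchar_t* alphabet = L"abcdefghijklmnopqrstuvwxyzäöå", and the individual characters as L'ö';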

This small example program works for me (also on a UNIX console using UTF-8) - please try it.

#include <stdlib.h> 
#include <stdio.h> 
#include <wchar.h> 
#include <locale.h> 

int main(int argc, char** argv) 
{ 
    wint_t letter = L'\0';
    setlocale(LC_ALL, ""); /* Initialize locale, to get the correct conversion to/from wchars */
    while(1)
    {
        if(!letter)
            printf("Type a letter to get its position: ");

        letter = fgetwc(stdin);
        if(letter == WEOF) {
            putchar('\n');
            return 0;
        } else if(letter == L'\n' || letter == L'\r') {
            letter = L'\0'; /* skip newlines - and print the instruction again */
        } else {
            printf("%d\n", (int)letter); /* print the character value, and don't print the instruction again */
        }
    }
    return 0;
}

Example session:

Type a letter to get its position: a 
97 
Type a letter to get its position: A 
65 
Type a letter to get its position: Ö 
214 
Type a letter to get its position: ö 
246 
Type a letter to get its position: Å 
197 
Type a letter to get its position: <^D>

As far as I know, on Windows this does not work for characters outside the Unicode BMP, but that is not an issue here.

Source

2009-11-12 20:32:59 gnud

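He is on Mac OS X. So the console is UTF8 ready, and the locale will not affect his encoding. – 2009-11-12 21:15:19

Of course the platform matters - 'ö' does not fit in one byte in UTF-8, so you cannot compare it as a character constant. – gnud 2009-11-12 21:30:19

I like this one the most, since it seems to be working. However, it gives me two prints, apparently one for the umlaut (195) and then another, which I assume is the letter code. – 2009-11-12 22:10:58

Hmm... first I'd like to say that "funny" characters are not chars. You cannot pass one of them to a function that accepts a char parameter and expect it to work.

Try this (add the remaining bits):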

char buf[100]; 
printf("Enter a string with funny characters: "); 
fflush(stdout); 
fgets(buf, sizeof buf, stdin); 
/* now print it, as if it was a sequence of `char`s */ 
char *p = buf; 
while (*p) { 
    printf("The character '%c' has value %d\n", *p, *p); 
    p++; 
}

Now try the same with wide characters: #include <wchar.h> and replace printf with wprintf, fgets with fgetws, etc...

Source

2009-11-12 22:02:10 pmg
