2011-07-25 92 views
24

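I used to program for Windows, but I'd like to try writing cross-platform applications, and I have a few questions, if you don't mind. Opening everything: is that possible?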

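Question 1

Is there some way to open a UNICODE \ ASCII file and automatically detect its encoding using bare ANSI C? MSDN says that fopen() can switch between various UNICODE formats (utf-8, utf-16, UNICODE BI \ LI) if I use the "ccs=UNICODE" flag. Experimentally, I found that switching from UNICODE to ASCII does not happen, but while trying to solve this I found that Unicode text files have some prefixes such as 0xFFFE, 0xFEFF or 0xFEBB.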

FILE *file; 
{ 
__int16 isUni; 
file = _tfopen(filename, _T("rb")); 
fread(&(isUni),1,2,file); 
fclose(file); 
if(isUni == (__int16)0xFFFE || isUni == (__int16)0xFEFF || isUni == (__int16)0xFEBB) 
    file = _tfopen(filename, _T("r,ccs=UNICODE")); 
else 
    file = _tfopen(filename, _T("r"));   
} 

So, can I make something like this cross-platform and not so ugly?

Question 2

I can do something like this on Windows, but will it work on Linux?

file = fopen(filename, "r"); 
fwscanf(file,"%lf",buffer); 

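If not, is there some ANSI C function that converts ASCII strings to Unicode? I want to work with Unicode strings in my program.

Question 3

Also, I need to output Unicode strings to the console. On Windows there is setlocale(*), but what should I do on Linux? It seems the console is already Unicode there.

Question 4

Generally speaking, I want to work with Unicode throughout my program, but I ran into some strange problems: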

f = fopen("inc.txt","rt"); 
fwprintf(f,L"Текст");   // converted successfully 
fclose(f); 
f = fopen("inc_u8.txt","rt, ccs = UNICODE"); 
fprintf(f,"text");    // failed to convert 
fclose(f); 

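P.S. Is there a good book about cross-platform programming that compares Windows and Linux code? Also something about working with Unicode, practical methods, that is. I don't want to be immersed in the history of plain UNICODE BI \ LI; I'm interested in specific C/C++ libraries.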

+0

There is a good discussion of the Unicode detection problem here: http://blogs.msdn.com/b/oldnewthing/archive/2007/04/17/2158334.aspx –

+3

I think ccs=anything is not standard, so it won't be portable – ShinTakezou

+0

ANSI C does not support UNICODE. It supports wchar_t, but wchar_t is not UNICODE, therefore -> no way – user411313

Answers

2

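Question 1:

Yes, you can detect the byte order mark (BOM), which is the byte sequence you discovered, if your file has one.
Searching Google and Stack Overflow will do the rest. As for "not so ugly": you can refactor/beautify your code, for example by writing a function that determines the BOM, running it first, and then calling fopen or _tfopen as appropriate. Then you can refactor once more and write your own fopen function. But it will still be ugly.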
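
The BOM check described above could be sketched in portable C roughly as follows. This is only an illustration, not the code from the question: the names bom_kind, classify_bom and open_skipping_bom are made up for this example, UTF-32 is deliberately ignored, and a file without a BOM is simply reported as BOM_NONE (it may still be UTF-8). Comparing individual bytes rather than a 16-bit integer avoids depending on the machine's endianness; note that the UTF-8 BOM is the three bytes EF BB BF.

```c
#include <assert.h>
#include <stdio.h>

/* Encodings that can be recognised from a byte order mark. */
typedef enum {
    BOM_NONE,      /* no BOM: ASCII, BOM-less UTF-8, or a legacy code page */
    BOM_UTF8,      /* EF BB BF */
    BOM_UTF16_LE,  /* FF FE */
    BOM_UTF16_BE   /* FE FF */
} bom_kind;

/* Classify the first bytes of a buffer (hypothetical helper). */
bom_kind classify_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;
}

/* Open a file in binary mode, report its BOM through *kind, and leave
   the stream positioned just after the BOM (or at the start if there
   is none). Returns NULL if the file cannot be opened. */
FILE *open_skipping_bom(const char *path, bom_kind *kind)
{
    static const long bom_len[] = { 0, 3, 2, 2 };  /* indexed by bom_kind */
    unsigned char head[3];
    size_t got;
    FILE *f;

    assert(path != NULL && kind != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    got = fread(head, 1, sizeof head, f);
    *kind = classify_bom(head, got);
    if (fseek(f, bom_len[*kind], SEEK_SET) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

The caller still has to decode the bytes that follow according to the reported kind; standard C has no fopen mode that does this for you.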

Question 2:

Yes, but the Unicode functions are not always named the same on Linux as they are on Windows.
Use defines. Maybe write your own TCHAR.h.
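
As a rough idea of what "your own TCHAR.h" might look like, here is a minimal sketch. Every name in it (MY_UNICODE, my_char, MY_T, my_strlen, my_fputs, my_length) is invented for this example; a real header would map many more functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* MY_UNICODE is a hypothetical build switch, e.g. passed as -DMY_UNICODE. */
#ifdef MY_UNICODE
typedef wchar_t my_char;
#define MY_T(s)    L##s      /* wide string literal */
#define my_strlen  wcslen
#define my_fputs   fputws
#else
typedef char my_char;
#define MY_T(s)    s         /* narrow string literal */
#define my_strlen  strlen
#define my_fputs   fputs
#endif

/* Length in characters, whichever character type the build selected. */
size_t my_length(const my_char *s)
{
    assert(s != NULL);
    return my_strlen(s);
}
```

Code written against my_char and MY_T("...") then compiles in either mode, which is the same trick TCHAR and _T() use on Windows.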

Question 3:

#include <locale.h> 
setlocale(LC_ALL, "en_US.UTF-8");  /* or "" to use the locale from the environment */

man 3 setlocale
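
A slightly fuller sketch of the same idea, assuming a Linux terminal whose environment already names a UTF-8 locale (the helper names use_environment_locale and put_wide_line are made up here). Passing "" to setlocale picks up LANG / LC_ALL instead of hard-coding a locale name that may not be installed. Windows consoles may need additional setup that this sketch does not cover.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Adopt the locale named by the environment (LANG, LC_ALL, ...).
   Returns 1 on success, 0 if that locale is not available. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Write a wide string followed by a newline to a stream.
   Returns the number of wide characters written, or a negative
   value if the conversion or the output fails. */
int put_wide_line(FILE *out, const wchar_t *text)
{
    assert(out != NULL && text != NULL);
    return fwprintf(out, L"%ls\n", text);
}
```

Typical use would be to call use_environment_locale() once at startup and then put_wide_line(stdout, L"...") for output. A stream should not mix wide and byte output (wprintf and printf on the same stdout), because its orientation is fixed by the first operation performed on it.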

Question 4:
Just use fwprintf.
The other one is not standard.
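
For illustration, a standard-C round trip with fwprintf and fgetws might look like the sketch below. The function names are hypothetical. Note that the file is opened with "w" for writing; the snippet in the question opens it with "rt", which is a read mode. Non-ASCII text additionally needs a suitable locale set with setlocale beforehand, otherwise the conversion to bytes can fail.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Write a wide string to a file. Returns 0 on success, -1 on failure. */
int write_wide_file(const char *path, const wchar_t *text)
{
    FILE *f;
    int ok;

    assert(path != NULL && text != NULL);
    f = fopen(path, "w");            /* the stream must be opened for writing */
    if (f == NULL)
        return -1;
    ok = fwprintf(f, L"%ls", text) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the first line of a file as a wide string into buf, which holds
   cap wide characters. Returns 0 on success, -1 on failure. */
int read_wide_line(const char *path, wchar_t *buf, int cap)
{
    FILE *f;
    int ok;

    assert(path != NULL && buf != NULL && cap > 0);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = fgetws(buf, cap, f) != NULL;
    fclose(f);
    return ok ? 0 : -1;
}
```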

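You can use the wxWidgets toolkit.
It uses Unicode, and its classes have the same implementation on Windows, Linux, Unix, and Mac.

A better question for you is how to convert ASCII to Unicode and vice versa. That goes like this: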

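#include <cstdlib>  // wcstombs, mbstowcs, MB_CUR_MAX
#include <string>

std::string Unicode2ASCII(std::wstring wstrStringToConvert) 
{ 
    size_t sze_StringLength = wstrStringToConvert.length() ; 

    if(0 == sze_StringLength) 
     return "" ; 

    // One wide character may need up to MB_CUR_MAX bytes in the multibyte encoding
    size_t sze_BufferSize = sze_StringLength * MB_CUR_MAX + 1 ; 
    char* chrarry_Buffer = new char[ sze_BufferSize ] ; 
    size_t sze_Written = wcstombs(chrarry_Buffer, wstrStringToConvert.c_str(), sze_BufferSize - 1) ; // const wchar_t* C string to multibyte C string, using the current locale 
    if((size_t)-1 == sze_Written) 
     sze_Written = 0 ; // conversion failed: return an empty string 
    chrarry_Buffer[sze_Written] = '\0'  ; 
    std::string strASCIIstring = chrarry_Buffer ; 
    delete[] chrarry_Buffer ; // array form, matching new[] 

    return strASCIIstring ; 
} 


std::wstring ASCII2Unicode(std::string strStringToConvert) 
{ 
    size_t sze_StringLength = strStringToConvert.length() ; 

    if(0 == sze_StringLength) 
     return L"" ; 

    // Never more wide characters than input bytes, so length + 1 is enough
    wchar_t* wchrarry_Buffer = new wchar_t[ sze_StringLength + 1 ] ; 
    size_t sze_Converted = mbstowcs(wchrarry_Buffer, strStringToConvert.c_str(), sze_StringLength) ; // const multibyte C string to wchar_t* C string, using the current locale 
    if((size_t)-1 == sze_Converted) 
     sze_Converted = 0 ; // conversion failed: return an empty string 
    wchrarry_Buffer[sze_Converted] = L'\0' ; 
    std::wstring wstrUnicodeString = wchrarry_Buffer ; 
    delete[] wchrarry_Buffer ; // array form, matching new[] 

    return wstrUnicodeString ; 
} 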

Edit: here is some insight into the Unicode functions available on Linux (wchar.h):

__BEGIN_NAMESPACE_STD 
/* Copy SRC to DEST. */ 
extern wchar_t *wcscpy (wchar_t *__restrict __dest, 
      __const wchar_t *__restrict __src) __THROW; 
/* Copy no more than N wide-characters of SRC to DEST. */ 
extern wchar_t *wcsncpy (wchar_t *__restrict __dest, 
      __const wchar_t *__restrict __src, size_t __n) 
    __THROW; 

/* Append SRC onto DEST. */ 
extern wchar_t *wcscat (wchar_t *__restrict __dest, 
      __const wchar_t *__restrict __src) __THROW; 
/* Append no more than N wide-characters of SRC onto DEST. */ 
extern wchar_t *wcsncat (wchar_t *__restrict __dest, 
      __const wchar_t *__restrict __src, size_t __n) 
    __THROW; 

/* Compare S1 and S2. */ 
extern int wcscmp (__const wchar_t *__s1, __const wchar_t *__s2) 
    __THROW __attribute_pure__; 
/* Compare N wide-characters of S1 and S2. */ 
extern int wcsncmp (__const wchar_t *__s1, __const wchar_t *__s2, size_t __n) 
    __THROW __attribute_pure__; 
__END_NAMESPACE_STD 

#ifdef __USE_XOPEN2K8 
/* Compare S1 and S2, ignoring case. */ 
extern int wcscasecmp (__const wchar_t *__s1, __const wchar_t *__s2) __THROW; 

/* Compare no more than N chars of S1 and S2, ignoring case. */ 
extern int wcsncasecmp (__const wchar_t *__s1, __const wchar_t *__s2, 
      size_t __n) __THROW; 

/* Similar to the two functions above but take the information from 
    the provided locale and not the global locale. */ 
# include <xlocale.h> 

extern int wcscasecmp_l (__const wchar_t *__s1, __const wchar_t *__s2, 
      __locale_t __loc) __THROW; 

extern int wcsncasecmp_l (__const wchar_t *__s1, __const wchar_t *__s2, 
       size_t __n, __locale_t __loc) __THROW; 
#endif 


/* Special versions of the functions above which take the locale to 
    use as an additional parameter. */ 
extern long int wcstol_l (__const wchar_t *__restrict __nptr, 
       wchar_t **__restrict __endptr, int __base, 
       __locale_t __loc) __THROW; 

extern unsigned long int wcstoul_l (__const wchar_t *__restrict __nptr, 
        wchar_t **__restrict __endptr, 
        int __base, __locale_t __loc) __THROW; 

__extension__ 
extern long long int wcstoll_l (__const wchar_t *__restrict __nptr, 
       wchar_t **__restrict __endptr, 
       int __base, __locale_t __loc) __THROW; 

__extension__ 
extern unsigned long long int wcstoull_l (__const wchar_t *__restrict __nptr, 
         wchar_t **__restrict __endptr, 
         int __base, __locale_t __loc) 
    __THROW; 

extern double wcstod_l (__const wchar_t *__restrict __nptr, 
      wchar_t **__restrict __endptr, __locale_t __loc) 
    __THROW; 

extern float wcstof_l (__const wchar_t *__restrict __nptr, 
       wchar_t **__restrict __endptr, __locale_t __loc) 
    __THROW; 

extern long double wcstold_l (__const wchar_t *__restrict __nptr, 
        wchar_t **__restrict __endptr, 
        __locale_t __loc) __THROW; 


/* Copy SRC to DEST, returning the address of the terminating L'\0' in 
    DEST. */ 
extern wchar_t *wcpcpy (wchar_t *__restrict __dest, 
      __const wchar_t *__restrict __src) __THROW; 

/* Copy no more than N characters of SRC to DEST, returning the address of 
    the last character written into DEST. */ 
extern wchar_t *wcpncpy (wchar_t *__restrict __dest, 
      __const wchar_t *__restrict __src, size_t __n) 
    __THROW; 
#endif /* use GNU */ 


/* Wide character I/O functions. */ 

#ifdef __USE_XOPEN2K8 
/* Like OPEN_MEMSTREAM, but the stream is wide oriented and produces 
    a wide character string. */ 
extern __FILE *open_wmemstream (wchar_t **__bufloc, size_t *__sizeloc) __THROW; 
#endif 

#if defined __USE_ISOC95 || defined __USE_UNIX98 
__BEGIN_NAMESPACE_STD 

/* Select orientation for stream. */ 
extern int fwide (__FILE *__fp, int __mode) __THROW; 


/* Write formatted output to STREAM. 

    This function is a possible cancellation point and therefore not 
    marked with __THROW. */ 
extern int fwprintf (__FILE *__restrict __stream, 
      __const wchar_t *__restrict __format, ...) 
    /* __attribute__ ((__format__ (__wprintf__, 2, 3))) */; 
/* Write formatted output to stdout. 

    This function is a possible cancellation point and therefore not 
    marked with __THROW. */ 
extern int wprintf (__const wchar_t *__restrict __format, ...) 
    /* __attribute__ ((__format__ (__wprintf__, 1, 2))) */; 
/* Write formatted output of at most N characters to S. */ 
extern int swprintf (wchar_t *__restrict __s, size_t __n, 
      __const wchar_t *__restrict __format, ...) 
    __THROW /* __attribute__ ((__format__ (__wprintf__, 3, 4))) */; 

/* Write formatted output to S from argument list ARG. 

    This function is a possible cancellation point and therefore not 
    marked with __THROW. */ 
extern int vfwprintf (__FILE *__restrict __s, 
       __const wchar_t *__restrict __format, 
       __gnuc_va_list __arg) 
    /* __attribute__ ((__format__ (__wprintf__, 2, 0))) */; 
/* Write formatted output to stdout from argument list ARG. 

    This function is a possible cancellation point and therefore not 
    marked with __THROW. */ 
extern int vwprintf (__const wchar_t *__restrict __format, 
      __gnuc_va_list __arg) 
    /* __attribute__ ((__format__ (__wprintf__, 1, 0))) */; 
/* Write formatted output of at most N character to S from argument 
    list ARG. */ 
extern int vswprintf (wchar_t *__restrict __s, size_t __n, 
       __const wchar_t *__restrict __format, 
       __gnuc_va_list __arg) 
    __THROW /* __attribute__ ((__format__ (__wprintf__, 3, 0))) */; 


/* Read formatted input from STREAM. 

    This function is a possible cancellation point and therefore not 
    marked with __THROW. */ 
extern int fwscanf (__FILE *__restrict __stream, 
      __const wchar_t *__restrict __format, ...) 
    /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */; 
/* Read formatted input from stdin. 

    This function is a possible cancellation point and therefore not 
    marked with __THROW. */ 
extern int wscanf (__const wchar_t *__restrict __format, ...) 
    /* __attribute__ ((__format__ (__wscanf__, 1, 2))) */; 
/* Read formatted input from S. */ 
extern int swscanf (__const wchar_t *__restrict __s, 
      __const wchar_t *__restrict __format, ...) 
    __THROW /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */; 

# if defined __USE_ISOC99 && !defined __USE_GNU \ 
    && (!defined __LDBL_COMPAT || !defined __REDIRECT) \ 
    && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K) 
# ifdef __REDIRECT 
/* For strict ISO C99 or POSIX compliance disallow %as, %aS and %a[ 
    GNU extension which conflicts with valid %a followed by letter 
    s, S or [. */ 
extern int __REDIRECT (fwscanf, (__FILE *__restrict __stream, 
       __const wchar_t *__restrict __format, ...), 
       __isoc99_fwscanf) 
    /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */; 
extern int __REDIRECT (wscanf, (__const wchar_t *__restrict __format, ...), 
       __isoc99_wscanf) 
    /* __attribute__ ((__format__ (__wscanf__, 1, 2))) */; 
extern int __REDIRECT_NTH (swscanf, (__const wchar_t *__restrict __s, 
        __const wchar_t *__restrict __format, 
        ...), __isoc99_swscanf) 
    /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */; 
# else 
extern int __isoc99_fwscanf (__FILE *__restrict __stream, 
       __const wchar_t *__restrict __format, ...); 
extern int __isoc99_wscanf (__const wchar_t *__restrict __format, ...); 
extern int __isoc99_swscanf (__const wchar_t *__restrict __s, 
       __const wchar_t *__restrict __format, ...) 
+0

Very welcome, but 'std::string' is not C. – mlp

+0

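Right, but the idea is the same. The functions used are pure C functions. Just use malloc instead of new and wcslen instead of .length(), change string to char* and wstring to wchar_t*, then omit .c_str() everywhere, and you have pure C code. –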
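
A pure C version along the lines of that comment might look roughly like this. It is a sketch, not the answerer's code: the names to_wide and to_multibyte are invented, and both functions convert according to the current locale, so a setlocale call is assumed to have happened earlier if non-ASCII text is involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Convert a multibyte string to a newly allocated wide string.
   Returns NULL on failure; the caller releases the result with free(). */
wchar_t *to_wide(const char *src)
{
    size_t cap, got;
    wchar_t *dst;

    assert(src != NULL);
    cap = strlen(src) + 1;           /* never more wide characters than bytes */
    dst = malloc(cap * sizeof *dst);
    if (dst == NULL)
        return NULL;
    got = mbstowcs(dst, src, cap);
    if (got == (size_t)-1) {         /* invalid multibyte sequence */
        free(dst);
        return NULL;
    }
    dst[got < cap ? got : cap - 1] = L'\0';
    return dst;
}

/* Convert a wide string to a newly allocated multibyte string.
   Returns NULL on failure; the caller releases the result with free(). */
char *to_multibyte(const wchar_t *src)
{
    size_t cap, got;
    char *dst;

    assert(src != NULL);
    cap = wcslen(src) * MB_CUR_MAX + 1;  /* worst case bytes per character */
    dst = malloc(cap);
    if (dst == NULL)
        return NULL;
    got = wcstombs(dst, src, cap);
    if (got == (size_t)-1) {         /* character not representable */
        free(dst);
        return NULL;
    }
    dst[got < cap ? got : cap - 1] = '\0';
    return dst;
}
```

Returning NULL on a failed conversion lets the caller distinguish an error from an empty string, which the C++ version above cannot do.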

1

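As I suggested in the comments, you should take a look at ICU, a cross-platform C library for Unicode handling created by IBM. It offers additional support for C++ and Java through a very powerful String class. It is used in many places, such as Android and iOS, so it is very stable and mature.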