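What is the unusual octet order BOM?
In the XML specification and in various implementations of Mozilla's Universal Charset Detector (UCSD), there is a BOM sequence in which either the byte order or the word order is reversed, but not both, and which they call "unusual octet order":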
F.1 Detection Without External Encoding Information
...
00 00 FF FE UCS-4, unusual octet order (2143)
FE FF 00 00 UCS-4, unusual octet order (3412)
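To make the (2143) and (3412) labels concrete, here is a minimal sketch of how I read them: number the four octets of the big-endian form 1 to 4, then emit them in the order the label gives. The function name `encodeUcs4` and the `order` string parameter are my own, purely for illustration, and the sketch assumes `order` is exactly four characters from '1' to '4' (it does no validation).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: serialize a code point as four octets in the order
// named by `order`, where "1234" is big-endian (octet 1 = most significant).
// Assumes `order` holds exactly four characters in the range '1'..'4'.
std::array<std::uint8_t, 4> encodeUcs4(std::uint32_t cp, const std::string& order) {
    // Big-endian octets, numbered 1..4 in the spec's notation.
    const std::array<std::uint8_t, 4> be = {
        static_cast<std::uint8_t>(cp >> 24),
        static_cast<std::uint8_t>(cp >> 16),
        static_cast<std::uint8_t>(cp >> 8),
        static_cast<std::uint8_t>(cp)};
    std::array<std::uint8_t, 4> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = be[static_cast<std::size_t>(order[i] - '1')];
    }
    return out;
}
```

With U+FEFF as input, "2143" swaps the octets within each 16-bit half and "3412" swaps the two 16-bit halves, which gives the two byte sequences listed above.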
Universal Character Set Detector (UCSD) source (just one example):
if (('\xFF' == aBuf[1]) && ('\x00' == aBuf[2]) && ('\x00' == aBuf[3]))
// FE FF 00 00 UCS-4, unusual octet order BOM (3412)
mDetectedCharset = "X-ISO-10646-UCS-4-3412";
else if (('\x00' == aBuf[1]) && ('\xFF' == aBuf[2]) && ('\xFE' == aBuf[3]))
// 00 00 FF FE UCS-4, unusual octet order BOM (2143)
mDetectedCharset = "X-ISO-10646-UCS-4-2143";
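For reference, a self-contained sketch of the same kind of check, covering all four 4-octet orders of U+FEFF. This is not the UCSD code: the function name `detectUcs4Bom` is hypothetical, and the "UTF-32BE" / "UTF-32LE" labels for the two usual orders are my assumption; only the two X-ISO-10646-UCS-4 names come from the excerpt above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not the UCSD implementation: return a label for a
// 4-octet BOM at the start of buf, or an empty string if none matches.
std::string detectUcs4Bom(const unsigned char* buf, std::size_t len) {
    if (buf == nullptr || len < 4) {
        return "";
    }
    // Compare the first four octets against one candidate pattern.
    auto matches = [buf](unsigned char a, unsigned char b,
                         unsigned char c, unsigned char d) {
        return buf[0] == a && buf[1] == b && buf[2] == c && buf[3] == d;
    };
    if (matches(0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";                // 1234 (label assumed)
    if (matches(0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";                // 4321 (label assumed)
    if (matches(0x00, 0x00, 0xFF, 0xFE)) return "X-ISO-10646-UCS-4-2143";  // unusual order
    if (matches(0xFE, 0xFF, 0x00, 0x00)) return "X-ISO-10646-UCS-4-3412";  // unusual order
    return "";
}
```

Note that FF FE 00 00 also begins with the UTF-16LE BOM (FF FE), so a real detector has to test the 4-octet patterns before the 2-octet ones.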
Universal Character Set Detector (UCSD) docs:
Known character sets
...
X-ISO-10646-UCS-4-2143
X-ISO-10646-UCS-4-3412
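Is there any hardware that uses this endianness and would produce such an encoding, or is there an ISO standard for it? Do any popular libraries support encoding or decoding it? Why aren't these sequences simply ignored like any other invalid sequence?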