2017-02-09

Excel VBA: opening a UTF-16 XML file

I am trying to open a UTF-16 encoded XML file with VBA in Excel and read it into a string variable named EntireFile.

My string variable currently starts like this:

ÿþ<?xml version="1.0" encoding="utf-16"?> 
<Test xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> 

As you can see, there are some characters at the very beginning that seem off.

This is how I fill the string variable:

Open PathToFile For Input As #1
    Do Until EOF(1)
       Line Input #1, textline
       EntireFile = EntireFile & textline
    Loop
Close #1

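According to Notepad++, the file is encoded as UCS-2 Little Endian, and a quick search of the internet suggests that this is Microsoft's equivalent of UTF-16?

I tried the brute-force approach of removing the first two characters, but that left me with an empty string.

All the Google results are about saving an XML file without a BOM, but what I am looking for is more or less the opposite.

Thanks for your time.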

1 Answer

You can use Win32 API functions to convert the encoding.

' Note: on 64-bit Office these declarations need the PtrSafe keyword,
' and the pointer parameters should be LongPtr instead of Long.
Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
         ByVal CodePage As Long, _
         ByVal dwFlags As Long, _
         ByVal lpWideCharStr As Long, _
         ByVal cchWideChar As Long, _
         ByVal lpMultiByteStr As Long, _
         ByVal cbMultiByte As Long, _
         ByVal lpDefaultChar As Long, _
         ByVal lpUsedDefaultChar As Long) As Long

Private Declare Function MultiByteToWideChar Lib "kernel32.dll" ( _
         ByVal CodePage As Long, _
         ByVal dwFlags As Long, _
         ByVal lpMultiByteStr As Long, _
         ByVal cbMultiByte As Long, _
         ByVal lpWideCharStr As Long, _
         ByVal cchWideChar As Long) As Long

Private Const CP_UTF16 As Long = 1200& 

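Private Function ConvertToUTF16(ByRef Source As String) As Byte()

    Dim Length As Long
    Dim Pointer As Long
    Dim Size As Long
    Dim Buffer() As Byte

    Length = Len(Source)
    Pointer = StrPtr(Source)

    ' First call with a null output buffer: returns the required size in bytes.
    Size = WideCharToMultiByte(CP_UTF16, 0, Pointer, Length, 0, 0, 0, 0)

    ' A return value of 0 means the call failed (or Source was empty).
    ' Without this check, ReDim Buffer(0 To -1) raises "Subscript out of range".
    If Size = 0 Then
        Err.Raise 5, "ConvertToUTF16", _
            "WideCharToMultiByte returned 0, LastDllError = " & Err.LastDllError
    End If

    ReDim Buffer(0 To Size - 1)

    ' Second call: performs the actual conversion into Buffer.
    WideCharToMultiByte CP_UTF16, 0, Pointer, Length, VarPtr(Buffer(0)), _
     Size, 0, 0

    ConvertToUTF16 = Buffer

End Function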

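Private Function ConvertFromUTF16(ByRef Source() As Byte) As String

    Dim Size As Long
    Dim Pointer As Long
    Dim Length As Long
    Dim Buffer As String

    ' Number of bytes in the source array and a pointer to its first element.
    Size = UBound(Source) - LBound(Source) + 1
    Pointer = VarPtr(Source(LBound(Source)))

    ' First call with a null output buffer: returns the required length in characters.
    Length = MultiByteToWideChar(CP_UTF16, 0, Pointer, Size, 0, 0)
    Buffer = Space$(Length)

    ' Second call: writes the converted characters into Buffer.
    MultiByteToWideChar CP_UTF16, 0, Pointer, Size, StrPtr(Buffer), Length
    ConvertFromUTF16 = Buffer

End Function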

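Private Const CP_UTF16 As Long = 1200& means that code page 1200 is used, which is UTF-16 little endian.

You can see the list of all code pages here: https://msdn.microsoft.com/de-de/library/windows/desktop/dd317756(v=vs.85).aspx

Note that this list marks code page 1200 as available only to managed applications, so WideCharToMultiByte and MultiByteToWideChar may reject it and return 0.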
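A VBA String is already stored as UTF-16 little endian internally, so another possible approach is to read the file as raw bytes and assign them to a String without calling the API at all. The following is only a minimal sketch under that assumption. The names DecodeUtf16Le and ReadUtf16LeFile are hypothetical helpers, not part of the code above, and the sketch assumes the file really is UTF-16 LE, as Notepad++ reported.

```vba
Option Explicit

' Hypothetical helper: turns raw UTF-16 LE bytes into a VBA String.
Public Function DecodeUtf16Le(ByRef Bytes() As Byte) As String
    Dim Text As String

    ' Assigning a Byte array to a String copies the bytes unchanged,
    ' two bytes per character.
    Text = Bytes

    ' Drop a leading byte order mark (U+FEFF, bytes FF FE) if present.
    If Len(Text) > 0 Then
        If AscW(Text) = &HFEFF Then Text = Mid$(Text, 2)
    End If

    DecodeUtf16Le = Text
End Function

' Hypothetical helper: reads the whole file in binary mode and decodes it.
Public Function ReadUtf16LeFile(ByVal PathToFile As String) As String
    Dim FileNumber As Integer
    Dim Bytes() As Byte
    Dim ByteCount As Long

    FileNumber = FreeFile
    Open PathToFile For Binary Access Read As #FileNumber

    ByteCount = LOF(FileNumber)
    If ByteCount > 0 Then
        ' Size the array to the file so Get reads every byte in one call.
        ReDim Bytes(0 To ByteCount - 1)
        Get #FileNumber, , Bytes
        ReadUtf16LeFile = DecodeUtf16Le(Bytes)
    End If

    Close #FileNumber
End Function
```

With these helpers in a module, the loop from the question could be replaced by EntireFile = ReadUtf16LeFile(PathToFile). Opening the file For Input reads it as ANSI text, which is the likely reason the BOM bytes FF FE show up as "ÿþ" at the start of the string.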

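Thanks for the answer. I added the code block as a module and, once EntireFile was filled, called the function ConvertToUTF16 (which I made public) from my code. WideCharToMultiByte gives me an index out of bounds error. I am sure the mistake is mine, since this is an imported function, but I do not know where it is. – celphy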

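I went through the whole code again and found that the imported function would behave as expected if Len(Source) were not returning 3000, which is obviously wrong. – celphy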