2016-09-28
0

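How to separate/filter English text from Chinese in Excel

I am working on a project that involves multiple Excel files whose cells contain English, Chinese, or a mix of English and Chinese.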

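I need to keep the rows that are entirely in Chinese and put them first. After those, I need the rows that contain both Chinese and English, and only then the rows that contain English only.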

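I came across the following three functions, which should help me tag the content accordingly, but they do not seem to work as expected and I cannot figure out why.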

Function ExtractChn(txt As String) 
Dim i As Integer 
Dim ChnTxt As String 
For i = 1 To Len(txt) 
    If Asc(Mid(txt, i, 1)) < 0 Then 
     ChnTxt = ChnTxt & Mid(txt, i, 1) 
    End If 
Next i 
ExtractChn = ChnTxt 
End Function 

Function ExtractEng(txt As String) 
Dim i As Integer 
Dim EngTxt As String 
For i = 1 To Len(txt) 
    If Asc(Mid(txt, i, 1)) >= 0 Then 
     EngTxt = EngTxt & Mid(txt, i, 1) 
    End If 
Next i 
ExtractEng = EngTxt 
End Function 

Function CheckTxt(txt) 
Dim i As Integer 
Dim Eng As Integer 
Dim Chn As Integer 
Chn = 0 
Eng = 0 
For i = 1 To Len(txt) 
    If Asc(Mid(txt, i, 1)) > 0 Then 
     Eng = 1 
    Else: 
     Chn = 1 
    End If 
Next i 
If Chn = 1 And Eng = 1 Then 'Contains Both Eng & Chn 
    CheckTxt = "BOTH" 
Else: 
    If Chn = 1 And Eng = 0 Then 'Chn 
     CheckTxt = "CHN" 
    Else: 
     If Chn = 0 And Eng = 1 Then 'Eng 
      CheckTxt = "ENG" 
     End If 
    End If 
End If 
End Function 

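The person who wrote them even provided a file demonstrating how the functions work. I have attached a link to that file, which is laid out as follows: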

Text|English part of it|Chinese part of it|ExtractEng|ExtractChn|CheckTxt 

According to the author's intent, the CheckTxt column should show CHN, ENG, or BOTH. However, it always shows ENG, and I do not understand why.

Any idea how to make it work? Or is there a simpler way to pre-filter the content in Excel? Any help is much appreciated.

Test Excel file from the developer

+0

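The code written by the original developer expects your system to use a [DBCS code page](https://msdn.microsoft.com/en-us/library/windows/desktop/dd317794(v=vs.85).aspx). On such systems, [Asc](https://msdn.microsoft.com/en-us/library/office/gg264313.aspx) returns a negative integer for Chinese characters.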
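
The comment explains why CheckTxt always reports ENG on a system without a DBCS code page: Asc works on the ANSI conversion of the character, so a Chinese character never comes back negative there. A check that does not depend on the code page can use AscW, which returns the Unicode code unit instead. The following is only a sketch under assumptions: the name ClassifyText is hypothetical, "Chinese" is assumed to mean the CJK Unified Ideographs block (U+4E00 to U+9FFF), and "English" is assumed to mean the ASCII letters A-Z and a-z. Spaces, digits and punctuation are ignored.

```vba
' Hypothetical helper (not part of the original post): tags a string as
' "CHN", "ENG", "BOTH" or "" based on Unicode code units, independent of
' the system's ANSI code page.
Function ClassifyText(ByVal txt As String) As String
    Dim i As Long
    Dim code As Long
    Dim hasChn As Boolean
    Dim hasEng As Boolean

    For i = 1 To Len(txt)
        ' AscW returns a signed 16-bit Integer, so values from &H8000 upward
        ' come back negative; masking with &HFFFF& maps them to 0..65535.
        code = AscW(Mid$(txt, i, 1)) And &HFFFF&

        If code >= &H4E00& And code <= &H9FFF& Then
            ' Assumption: CJK Unified Ideographs only.
            hasChn = True
        ElseIf (code >= 65 And code <= 90) Or (code >= 97 And code <= 122) Then
            ' ASCII letters A-Z and a-z.
            hasEng = True
        End If
    Next i

    If hasChn And hasEng Then
        ClassifyText = "BOTH"
    ElseIf hasChn Then
        ClassifyText = "CHN"
    ElseIf hasEng Then
        ClassifyText = "ENG"
    Else
        ClassifyText = ""
    End If
End Function
```

Placed in a standard module, it could be called from a worksheet as =ClassifyText(A2). The trailing & on the hexadecimal literals matters: without it, &H9FFF is read as a negative Integer and the range test would never match.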

Answers

1

This sounds like a job for regular expressions!

Function getCharSet(Target As Range) As String
    ' U+4E00 to U+9FFF is the CJK Unified Ideographs block; the extra
    ' ranges in the earlier pattern were all subsets of it.
    Const ChinesePattern = "[\u4E00-\u9FFF]"
    Const EnglishPattern = "[A-Za-z]"
    Dim results As String
    Dim Data, v
    Dim Regex1 As Object
    Set Regex1 = CreateObject("VBScript.RegExp")
    Regex1.Global = True

    ' Wrap a single cell in an array so both cases can be iterated the same way.
    If Target.Count = 1 Then
        Data = Array(Target.Value2)
    Else
        Data = Target.Value2
    End If

    For Each v In Data

        ' InStr returns a position, so compare it with 0.
        ' "Not InStr(...)" is a bitwise Not and is never False.
        If InStr(results, "CHN") = 0 Then
            Regex1.Pattern = ChinesePattern
            If Regex1.Test(v) Then
                If Len(results) Then
                    ' English was already found, so the range contains both.
                    getCharSet = "CHN" & " - " & results
                    Exit Function
                Else
                    results = "CHN"
                End If
            End If
        End If

        If InStr(results, "ENG") = 0 Then
            Regex1.Pattern = EnglishPattern
            If Regex1.Test(v) Then
                If Len(results) Then
                    ' Chinese was already found, so the range contains both.
                    getCharSet = results & " - ENG"
                    Exit Function
                Else
                    results = "ENG"
                End If
            End If
        End If
    Next
    getCharSet = results

End Function
1

A basic approach:

Sub Main() 

Dim sh As Worksheet 
Set sh = ActiveSheet 

Dim rng As Range 
Set rng = sh.Range("A6:D10") 

Call Separate_English_Chinese(rng) 

End Sub 

Sub Separate_English_Chinese(rng As Range) 

Dim sh As Worksheet 
Set sh = rng.Parent 

Dim EnglishCharacters As String 
Dim colEng As Long, colChn As Long, colContains As Long 
Dim a As String, i As Long, k As Long 
Dim colFullText As Long, txtEnglish As String, txtChinese As String 
Dim Result As Long, Contains As String 
Dim First As Long, Last As Long 

First = rng.Row 
Last = rng.Rows.Count + rng.Row - 1 

EnglishCharacters = "qwertyuiopasdfghjklzxcvbnm" 

EnglishCharacters = UCase(EnglishCharacters) & LCase(EnglishCharacters) 

colFullText = 1 
colEng = 2 
colChn = 3 
colContains = 4 

For i = First To Last 

    a = sh.Cells(i, colFullText).Value 

    txtEnglish = "" 
    txtChinese = "" 

    For k = 1 To Len(a) 

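     ' Note: any character that is not an ASCII letter (spaces, digits,
     ' punctuation included) ends up in the Chinese text.
     If InStr(EnglishCharacters, Mid(a, k, 1)) Then 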
      txtEnglish = txtEnglish & Mid(a, k, 1) 
     Else 
      txtChinese = txtChinese & Mid(a, k, 1) 
     End If 

    Next 

    sh.Cells(i, colEng).Value = txtEnglish 
    sh.Cells(i, colChn).Value = txtChinese 

    Result = 0 
    If txtEnglish <> "" Then Result = Result + 1 
    If txtChinese <> "" Then Result = Result + 10 

    Select Case Result 

     Case 1 
     Contains = "ENG" 
     Case 10 
     Contains = "CHN" 
     Case 11 
     Contains = "BOTH" 
     Case Else 
     Contains = "" 

    End Select 

    sh.Cells(i, colContains).Value = Contains 

Next 

End Sub
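
Once every row carries a tag, the ordering asked for in the question (Chinese only first, then mixed, then English only) can be obtained by sorting on a numeric key. A minimal sketch, assuming the tag is one of the strings produced above; the name SortKey is hypothetical, and "CHN - ENG" is included because that is what getCharSet returns for mixed content.

```vba
' Hypothetical helper: maps a tag to a sort position so that an ascending
' sort yields Chinese-only rows, then mixed rows, then English-only rows.
Function SortKey(ByVal tag As String) As Long
    Select Case UCase$(Trim$(tag))
        Case "CHN"
            SortKey = 1
        Case "BOTH", "CHN - ENG"
            SortKey = 2
        Case "ENG"
            SortKey = 3
        Case Else
            ' Empty or unrecognised tags sort last.
            SortKey = 4
    End Select
End Function
```

With the tag in column D, for example, a helper column containing =SortKey(D6) could be filled down and the sheet sorted ascending on that column.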