2012-11-27 24 views

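VBA: open a .doc in "Recover Text from Any File" mode

I am trying to convert a large number of old .DOC files to either PDF or RTF. So far I have found a way to do the latter (convert to RTF), but the formatting from the old Word application is still present in the documents. If you open Microsoft Word (I am using 2010) and click File > Open, there is a drop-down menu that lets you choose "Recover Text from Any File (*.*)". Is it possible to use this during the conversion to filter out the formatting data in the .DOC files? Below are a couple of examples of scripts I am currently trying to modify.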

This one works, although it only seems to append .rtf to the end of the file name rather than actually change the format:

Sub SaveAllAsDOCX() 
Dim strFilename As String 
Dim strDocName As String 
Dim strPath As String 
Dim oDoc As Document 
Dim fDialog As FileDialog 
Dim intPos As Integer 
Set fDialog = Application.FileDialog(msoFileDialogFolderPicker) 
With fDialog 
    .Title = "Select folder and click OK" 
    .AllowMultiSelect = False 
    .InitialView = msoFileDialogViewList 
    If .Show <> -1 Then 
     MsgBox "Cancelled By User", , "List Folder Contents" 
     Exit Sub 
    End If 
    strPath = fDialog.SelectedItems.Item(1) 
    If Right(strPath, 1) <> "\" Then strPath = strPath + "\" 
End With 
If Documents.Count > 0 Then 
    Documents.Close SaveChanges:=wdPromptToSaveChanges 
End If 
If Left(strPath, 1) = Chr(34) Then 
    strPath = Mid(strPath, 2, Len(strPath) - 2) 
End If 
strFilename = Dir$(strPath & "*.doc") 
While Len(strFilename) <> 0 
    Set oDoc = Documents.Open(strPath & strFilename) 
    strDocName = ActiveDocument.FullName 
    intPos = InStrRev(strDocName, ".") 
    strDocName = Left(strDocName, intPos - 1) 
    strDocName = strDocName & ".docx" 
    oDoc.SaveAs FileName:=strDocName, _ 
     FileFormat:=wdFormatDocumentDefault 
    oDoc.Close SaveChanges:=wdDoNotSaveChanges 
    strFilename = Dir$() 
Wend 
End Sub 
This one has not managed any conversion at all so far:

Option Explicit 
Sub ChangeDocsToTxtOrRTFOrHTML() 
'with export to PDF in Word 2007 
    Dim fs As Object 
    Dim oFolder As Object 
    Dim tFolder As Object 
    Dim oFile As Object 
    Dim strDocName As String 
    Dim intPos As Integer 
    Dim locFolder As String 
    Dim fileType As String 
    On Error Resume Next 
    locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\myDocs") 
    Select Case Application.Version 
     Case Is < 12 
      Do 
       fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT")) 
      Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML") 
     Case Is >= 12 
      Do 
       fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT")) 
      Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF") 
    End Select 
    Application.ScreenUpdating = False 
    Set fs = CreateObject("Scripting.FileSystemObject") 
    Set oFolder = fs.GetFolder(locFolder) 
    Set tFolder = fs.CreateFolder(locFolder & "Converted") 
    Set tFolder = fs.GetFolder(locFolder & "Converted") 
    For Each oFile In oFolder.Files 
     Dim d As Document 
     Set d = Application.Documents.Open(oFile.Path) 
     strDocName = ActiveDocument.Name 
     intPos = InStrRev(strDocName, ".") 
     strDocName = Left(strDocName, intPos - 1) 
     ChangeFileOpenDirectory tFolder 
     Select Case fileType 
     Case Is = "TXT" 
      strDocName = strDocName & ".txt" 
      ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText 
     Case Is = "RTF" 
      strDocName = strDocName & ".rtf" 
      ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF 
     Case Is = "HTML" 
      strDocName = strDocName & ".html" 
      ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML 
     Case Is = "PDF" 
      strDocName = strDocName & ".pdf" 

      ' *** Word 2007 users - remove the apostrophe at the start of the next line *** 
      'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF 

     End Select 
     d.Close 
     ChangeFileOpenDirectory oFolder 
    Next oFile 
    Application.ScreenUpdating = True 
End Sub 

Answers


Here is a way to do what you want with a VBA script, without having to use Word's built-in "Recover Text from Any File" mode.

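It converts every .doc/.docx in a directory to .txt, but it can be used to convert to any other format the parent application supports (I tested it with Word 2010). Here it is: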

'------------ VBA script start ------------- 
Sub one1() 
Set fs = CreateObject("Scripting.FileSystemObject") 
Set list1 = fs.GetFolder(ActiveDocument.Path) 
For Each fl In list1.files 
    If InStr(fl.Type, "Word") >= 1 And Not fl.Path = ActiveDocument.Path & "\" & ActiveDocument.Name Then 
    Set wordapp = CreateObject("word.Application") 
    Set Doc1 = wordapp.Documents.Open(fl.Path) 
    'wordapp.Visible = True 
    Doc1.SaveAs2 FileName:=fl.Name & ".txt", fileformat:=wdFormatText 
    wordapp.Quit 
    End If 
Next 
End Sub 
'------------ VBA script end ------------- 

To save as PDF, use

Doc1.SaveAs2 FileName:=fl.Name & ".pdf", fileformat:=wdFormatPDF 

instead.

To save as RTF, use

Doc1.SaveAs2 FileName:=fl.Name & ".rtf", fileformat:=wdFormatRTF 

instead.

Or, say, for HTML:

Doc1.SaveAs2 FileName:=fl.Name & ".html", fileformat:=wdFormatHTML 

and so on.
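If you switch formats often, the extension and the format constant can be kept together in one place. This is only a sketch: `SaveFormatFor` is a made-up helper name, not part of the script above, and it assumes the code runs inside Word's own VBA editor, where the `wdFormat...` constants are defined.

```vba
' Hypothetical helper (not part of the original script): maps a target
' extension to the matching WdSaveFormat constant.
' ASSUMPTION: runs inside Word VBA, so the wdFormat* constants exist.
Function SaveFormatFor(ByVal ext As String) As Long
    Select Case LCase$(ext)
        Case "txt":  SaveFormatFor = wdFormatText
        Case "rtf":  SaveFormatFor = wdFormatRTF
        Case "html": SaveFormatFor = wdFormatHTML
        Case "pdf":  SaveFormatFor = wdFormatPDF
        Case Else:   SaveFormatFor = -1   ' extension not handled by this sketch
    End Select
End Function
```

The save line would then read something like `Doc1.SaveAs2 FileName:=fl.Name & "." & ext, FileFormat:=SaveFormatFor(ext)`, with `ext` set once at the top of the macro.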

There are some drawbacks that I did not bother to look into, because they are harmless:

  • An error message pops up at the end of the run, but it has no consequences.

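  • It tries to open itself, because it is a VBA script inside a document and it is a script that opens documents. You then have to tell it to open the file read-only when the message pops up.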

  • It saves all the documents to C:\users\username\Documents rather than to the folder it is run from, which would be better in most cases.

  • It is slow: expect about 2-3 files per second on most ordinary PCs.
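Most of the drawbacks listed above can probably be avoided by reusing the running Word instance, skipping Word's "~$" lock files and the macro document itself, and saving with a full path. The following is an untested sketch under those assumptions, not a drop-in replacement. `ConvertFolderSketch` and `RecoverTextOpenFormat` are made-up names. The second one also tries to pick up the "Recover Text from Any File" converter the question asks about, on the assumption that it is listed in `Application.FileConverters` with a `FormatName` containing "Recover Text" (that text depends on the UI language, so check it on your machine).

```vba
' Hypothetical helper: searches Word's installed file converters for the
' text-recovery one.
' ASSUMPTION: its FormatName contains "Recover Text" (English UI).
' Falls back to wdOpenFormatAuto when nothing matches.
Function RecoverTextOpenFormat() As Long
    Dim conv As FileConverter
    RecoverTextOpenFormat = wdOpenFormatAuto
    For Each conv In Application.FileConverters
        If conv.CanOpen Then   ' OpenFormat is only valid for converters that can open
            If InStr(1, conv.FormatName, "Recover Text", vbTextCompare) > 0 Then
                RecoverTextOpenFormat = conv.OpenFormat
                Exit Function
            End If
        End If
    Next conv
End Function

Sub ConvertFolderSketch()
    Dim fs As Object, fl As Object
    Dim doc As Document
    Dim srcFolder As String, thisFile As String, outName As String
    Dim openFmt As Long

    ' Capture these first: ActiveDocument changes once other files are opened.
    ' ASSUMPTION: the active document has been saved, so Path is not empty.
    srcFolder = ActiveDocument.Path
    thisFile = ActiveDocument.FullName
    openFmt = RecoverTextOpenFormat()

    Set fs = CreateObject("Scripting.FileSystemObject")
    For Each fl In fs.GetFolder(srcFolder).Files
        ' Extensions starting with "doc" (.doc, .docx, .docm); skip Word's
        ' "~$" lock files and the document that holds this macro.
        If LCase$(fs.GetExtensionName(fl.Name)) Like "doc*" _
           And Left$(fl.Name, 2) <> "~$" _
           And StrComp(fl.Path, thisFile, vbTextCompare) <> 0 Then
            ' Open in the already running Word instead of starting a new one per file.
            Set doc = Documents.Open(FileName:=fl.Path, ConfirmConversions:=False, _
                                     ReadOnly:=True, AddToRecentFiles:=False, _
                                     Format:=openFmt)
            ' Full path, so the output lands next to the source file.
            outName = srcFolder & "\" & fs.GetBaseName(fl.Name) & ".txt"
            doc.SaveAs2 FileName:=outName, FileFormat:=wdFormatText
            doc.Close SaveChanges:=wdDoNotSaveChanges
        End If
    Next fl
End Sub
```

Text recovery is meant for the old binary .doc files; for .docx it would most likely produce garbage, so in that case pass `wdOpenFormatAuto` to `Documents.Open` instead of `openFmt`. To write another format, change both the ".txt" extension and `wdFormatText` together.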