2013-10-05 47 views
0

我有超過300個包含詞表的Word文檔,我一直在努力爲excel編寫一個VBA腳本來提取我需要的信息,而我我完全是Visual Basic的新手。我需要將文件名複製到第一個單元格,以及下面的單元格以包含我嘗試提取的信息,接着是下一個文件名,循環播放直到搜索並提取了所有單詞文檔。我嘗試了多種不同的方式,但是我能找到的最接近的代碼如下。它用於拉動零件號碼,但不是描述。它還會提取不需要在那裏的無關信息,但如果這些信息是必要的危險,我可以解決這些信息。 我有一個示例word文件(用其他信息替換了敏感信息),但我不確定如何附加單詞文檔或word文檔頁面1和2的jpeg。我知道如果你能看到它,這將是有益的,所以請讓我知道如何在這裏或者你看到它,這樣你就可以看到它。多個文件提取一個類似的詞表,從每個excel VBA

所以再次重申:

  • 我在第一個單元需要的文件名(A1)
  • 我需要一定的電池出表3中從Word文檔中脫穎而出
  • 如果儘可能在列C(C2:C?)中列B(B2:B?)和 混合字母和數字,然後在下一個 行下,下一個文件名(A?),並繼續重複。如果您對 有任何想法或建議,請告訴我。如果我不能 張貼的圖片,或實際的樣本文件,我願意 電子郵件,或任何其他方式獲得幫助在此。

這是我一直試圖操縱的代碼。我發現,這是一個形式的第一個和最後一排,我試圖得到它的工作,我的目的無濟於事:

Sub GetTablesFromWord() 

    'this Excel file must be in 
    'the same folder with the Word 
    'document files that are to be'processed.  
    Dim wApp As Word.Application 
    Dim wDoc As Word.Document 
    Dim wTable As Word.Table 
    Dim wCell As Word.Cell 
    Dim basicPath As String 
    Dim fName As String  

    Dim myWS As Worksheet 
    Dim xlCell As Range 
    Dim lastRow As Long 
    Dim rCount As Long 
    Dim cCount As Long 
    Dim RLC As Long 
    Dim CLC As Long  
    basicPath = ThisWorkbook.Path & Application.PathSeparator 
    'change the sheet name as required 
    Set myWS = ThisWorkbook.Worksheets("Sheet1") 
    'clear any/all previous data on the sheet myWS.Cells.Clear  

    '"open" Word Set wApp = CreateObject("Word.Application") 
    'get first .doc file name in the folder 
    'with this Excel file 
    fName = Dir(basicPath & "*.doc*") 
    Do While fName <> "" 
     'this puts the filename into column A to 
     'help separate the table data in Excel 
     myWS.Range("A" & Rows.Count).End(xlUp).Offset(1, 0) = _ 
     "FILE: [" & fName & "]" 
     'open the Word file 
     wApp.Documents.Open basicPath & fName 
     Set wDoc = wApp.Documents(1) 
     'if there is a table in the 
     'Word Document, work with it 
     If wDoc.Tables.Count > 0 Then 
     Set wTable = wDoc.Tables(3) 
     rCount = wTable.Rows.Count 
     cCount = wTable.Columns.Count 
      For RLC = 1 To rCount 
      lastRow = myWS.Range("A" & Rows.Count).End(xlUp).Row + 1 
      For CLC = 1 To cCount 
      'if there are merged cells in the 
      'Word table, an error will be 
      'generated - ignore the error, 
      'but also won't process the data 
      On Error Resume Next 
      Set wCell = wTable.Cell(RLC, CLC) 
      If Err <> 0 Then 
      Err.Clear 
      Else 
      If CLC = 1 Then 
      Set xlCell = myWS.Range("A" & lastRow) 
       xlCell = wCell 
      Else 
       Set xlCell = myWS.Range("B" & lastRow) 
       xlCell = wCell 
      End If 
      End If 
      On Error GoTo 0 
     Next 
     Next 
     Set wCell = Nothing 
     Set wTable = Nothing 
    End If ' end of wDoc.Tables.Count test 
    wDoc.Close False 
    Set wDoc = Nothing 
    fName = Dir() 
' gets next .doc* filename in the folder 
    Loop wApp.Quit 
    Set wApp = Nothing 
    MsgBox "Task Completed" 
End Sub 

回答

0

代碼遍歷所有的.docx文件包含在一個文件夾中,將數據提取到電子表格中,關閉單詞文檔並移至下一個文檔。 word文檔的名字被提取到列A中,並且文檔中第三個表格中的值被提取到列B中。這應該是一個很好的起點。

Sub wordScrape() 

Dim wrdDoc As Object, objFiles As Object, fso As Object, wordApp As Object 
Dim sh1 As Worksheet 
Dim x As Integer 

FolderName = "C:\code" ' Change this to the folder containing your word documents 

Set sh1 = ThisWorkbook.Sheets(1) 
Set fso = CreateObject("Scripting.FileSystemObject") 
Set wordApp = CreateObject("Word.application") 
Set objFiles = fso.GetFolder(FolderName).Files 

x = 1 
For Each wd In objFiles 
    If InStr(wd, ".docx") And InStr(wd, "~") = 0 Then 
     Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True) 
     sh1.Cells(x, 1) = wd.Name 
     sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(3).Cell(Row:=3, Column:=2).Range) 
     'sh1.Cells(x, 3) = ....more extracted data.... 
     x = x + 1 
    wrdDoc.Close 
    End If 

Next wd 
wordApp.Quit 
End Sub 
相關問題