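Extract fields from a huge CSV file and write them to a table, text, or CSV file
I have a huge CSV file. It is 4 GB; I don't know how many rows it has, but it has 320 columns.
Because it can't be opened in any program (short of using a third-party tool to split the file into several parts), I'm trying to find a way to extract just the data I need. I only need about 10-15 columns.
I've seen a lot of solutions online (mostly in VBS), but I couldn't get any of them to work. I get errors, and I don't know VBS well enough to troubleshoot them.
Can anyone help?
Thanks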
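PS: Here is one example of the VBS code I found and tried, without any luck.
The original error was "800a01f4 Variable is undefined", and advice on the web suggested removing Option Explicit. Once I did that, the next error was "800a01fa Class not defined".
In both cases the line that raises the error is "Set adoJetCommand = New ADODB.Command"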
Option Explicit
Dim adoCSVConnection, adoCSVRecordSet, strPathToTextfile
Dim strCSVFile, adoJetConnection,adoJetCommand, strDBPath
Const adCmdText = &H0001
' Specify path to CSV file.
strPathToTextFile = "C:\Users\natalie.rynda\Documents\Temp\RemailMatch\"
' Specify CSV file name.
strCSVFile = "NPIOld.csv"
' Specify Access database file.
strDBPath = "C:\Users\natalie.rynda\Documents\Temp\RemailMatch\NPIs.mdb"
' Open connection to the CSV file.
Set adoCSVConnection = CreateObject("ADODB.Connection")
Set adoCSVRecordSet = CreateObject("ADODB.Recordset")
' Open CSV file with header line.
adoCSVConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strPathtoTextFile & ";" & _
"Extended Properties=""text;HDR=YES;FMT=Delimited"""
adoCSVRecordset.Open "SELECT * FROM " & strCSVFile, adoCSVConnection
' Open connection to MS Access database.
Set adoJetConnection = CreateObject("ADODB.Connection")
adoJetConnection.ConnectionString = "DRIVER=Microsoft Access Driver (*.mdb);" _
& "FIL=MS Access;DriverId=25;DBQ=" & strDBPath & ";"
adoJetConnection.Open
' ADO command object to insert rows into Access database.
Set adoJetCommand = New ADODB.Command
Set adoJetCommand.ActiveConnection = adoJetConnection
adoJetCommand.CommandType = adCmdText
' Read the CSV file.
Do Until adoCSVRecordset.EOF
' Insert a row into the Access database.
adoJetCommand.CommandText = "INSERT INTO NPIs " _
& "(NPI, EntityTypeCode, ReplacementNPI, EIN, MAddress1, MAddress2, MCity, MState, MZIP, SAddress1, SAddress2, SCity, SState, SZIP, ProviderEnumerationDate, LastUpdateDate, NPIDeactivationReasonCode, NPIDeactivationDate, NPIReactivationDate) " _
& "VALUES (" _
& "'" & adoCSVRecordset.Fields("NPI").Value & "', " _
& "'" & adoCSVRecordset.Fields("Entity Type Code").Value & "', " _
& "'" & adoCSVRecordset.Fields("Replacement NPI").Value & "', " _
& "'" & adoCSVRecordset.Fields("Employer Identification Number (EIN)").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider First Line Business Mailing Address").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Second Line Business Mailing Address").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Business Mailing Address City Name").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Business Mailing Address State Name").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Business Mailing Address Postal Code").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider First Line Business Practice Location Address").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Second Line Business Practice Location Address").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Business Practice Location Address City Name").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Business Practice Location Address State Name").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Business Practice Location Address Postal Code").Value & "', " _
& "'" & adoCSVRecordset.Fields("Provider Enumeration Date").Value & "', " _
& "'" & adoCSVRecordset.Fields("Last Update Date").Value & "', " _
& "'" & adoCSVRecordset.Fields("NPI Deactivation Reason Code").Value & "', " _
& "'" & adoCSVRecordset.Fields("NPI Deactivation Date").Value & "', " _
& "'" & adoCSVRecordset.Fields("NPI Reactivation Date").Value & "')"
adoJetCommand.Execute
adoCSVRecordset.MoveNext
Loop
' Clean up.
adoCSVRecordset.Close
adoCSVConnection.Close
adoJetConnection.Close
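Not part of the original post: since only 10-15 of the 320 columns are needed, one option is to skip ADO entirely and stream the file one row at a time, so the 4 GB file never has to fit in memory. The sketch below uses Python's standard `csv` module rather than VBS. The function name `extract_columns`, its parameters, and the UTF-8 default are assumptions made for illustration, and it assumes the first line of the file is a header row.

```python
import csv


def extract_columns(src_path, dst_path, wanted, encoding="utf-8"):
    """Copy only the `wanted` columns from src_path to dst_path.

    The source is read one row at a time, so memory use stays small
    regardless of file size. Returns the number of data rows written.
    """
    rows_written = 0
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.DictReader(src)  # first line is treated as the header
        header = reader.fieldnames or []
        missing = [name for name in wanted if name not in header]
        if missing:
            # Fail early instead of silently writing empty columns.
            raise KeyError(f"columns not found in header: {missing}")

        # extrasaction="ignore" drops every column not listed in `wanted`.
        writer = csv.DictWriter(dst, fieldnames=wanted, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

With the header names used in the script above, a call such as `extract_columns("NPIOld.csv", "NPISubset.csv", ["NPI", "Entity Type Code", "Last Update Date"])` would write a much smaller CSV that Excel or Access can open; `NPISubset.csv` is a made-up output name. Because the `csv` module handles quoting, values containing commas or apostrophes pass through without the manual quote handling that the concatenated INSERT statement depends on.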
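I'd like to add that I saw this post, http://stackoverflow.com/questions/427488/want-vba-in-excel-to-read-very-large-csv-and-create-output-file-of-a-small-subse?rq=1, and tried the VBS option (error: "No value given for one or more required parameters"); I don't understand the VBA solution. I didn't just post without first spending hours searching and trying everything I could. Thank you! – lalachka 2012-07-27 00:37:49
Thank you, I'll fix that, but I'm afraid my error is thrown before I even get to that point – lalachka 2012-07-27 01:25:47
I just checked, and I don't see any field mismatch – lalachka 2012-07-27 01:29:40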