2016-01-12
0

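Extract data from Power Pivot ("item.data")

I have received a workbook that contains two tables held in Power Pivot (one of about 1 million rows, the other about 20 million rows). I would like to rip the data out (in any format really, but let's say a CSV) so that I can use it in R + PostgreSQL.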

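I can't export to an Excel table because there are more than 1 million rows, and copy-pasting the data only works when I select around 200,000 rows at a time. So I'm a bit stuck! I tried renaming the xlsx to a zip file and opening the "item.data" file in Notepad++, but it appears to be encrypted in some way.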
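As a hedged aside: an .xlsx file is already a ZIP archive, so its entries can be inspected without renaming it. The sketch below lists the archive members and picks out any whose name ends in `item.data`. The function names are made up for illustration, and this only locates the embedded model file; it does not decode it, since the file is a binary format that a text editor cannot display meaningfully.

```python
import zipfile


def list_archive_members(xlsx_path):
    """Return (name, uncompressed_size_in_bytes) for every entry in the workbook archive."""
    with zipfile.ZipFile(xlsx_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def find_model_members(xlsx_path, suffix="item.data"):
    """Return only the entries whose name ends with `suffix` (case-insensitive)."""
    return [
        (name, size)
        for name, size in list_archive_members(xlsx_path)
        if name.lower().endswith(suffix)
    ]
```

Calling `find_model_members` on the workbook path should show where the model sits in the archive and how large it is once unpacked, which gives a rough idea of how much data has to come out through a query instead.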

I would appreciate any solution (happy to use VBA, Python, or SQL).

Edit: I put together some VBA that works fine for the table of around 0.5 million rows, but breaks on the 17 million row one:

Public Sub CreatePowerPivotDmvInventory() 
    Dim conn As ADODB.Connection 
    Dim sheet As Excel.Worksheet 
    Dim wbTarget As Workbook 
    On Error GoTo FailureOutput 

    Set wbTarget = ActiveWorkbook 
    wbTarget.Model.Initialize 

    Set conn = wbTarget.Model.DataModelConnection.ModelConnection.ADOConnection 

    ' Call function by passing the DMV name 
    ' E.g. Partners 
    WriteDmvContent "Partners", conn 

    MsgBox "Finished" 
    Exit Sub 

FailureOutput: 
    MsgBox Err.Description 
End Sub 

Private Sub WriteDmvContent(ByVal dmvName As String, ByRef conn As ADODB.Connection) 
    Dim rs As ADODB.Recordset 
    Dim mdx As String 
    Dim i As Integer 

    mdx = "EVALUATE " & dmvName 

    Set rs = New ADODB.Recordset 
    rs.ActiveConnection = conn 
    rs.Open mdx, conn, adOpenForwardOnly, adLockOptimistic 

    ' Setup CSV file (improve this code) 
    Dim myFile As String 
    myFile = "H:\output_table_" & dmvName & ".csv" 
    Open myFile For Output As #1 

    ' Output column names 
    For i = 0 To rs.Fields.count - 1 
     If i = rs.Fields.count - 1 Then 
      Write #1, rs.Fields(i).Name 
     Else 
      Write #1, rs.Fields(i).Name, 
     End If 
    Next i 

    ' Output of the query results 
    Do Until rs.EOF 
     For i = 0 To rs.Fields.count - 1 
      If i = rs.Fields.count - 1 Then 
       Write #1, rs.Fields(i) 
      Else 
       Write #1, rs.Fields(i), 
      End If 
     Next i 
     rs.MoveNext 
    Loop 
    Close #1 
    rs.Close 
    Set rs = Nothing 

    Exit Sub 

FailureOutput: 
    MsgBox Err.Description 
End Sub 

Answers

1

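DAX Studio will let you query the data model in an Excel workbook and output the results in various formats, including flat files.

The query you need is just: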

EVALUATE 
<table name> 
+0

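Thanks! However, when I run "EVALUATE table" I get (after a few seconds) a "The server sent an unrecognizable response" error. If it helps, I have updated my original post with some VBA that works for one of the tables – mptevsion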

+0

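For the other, larger table I chose text-file output, but it gave me the error "Memory error: Allocation failure ...". I guess it first loads the whole model into RAM and then exports, rather than using a generator to fetch each row? – mptevsion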

+0

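There is no real physical concept of a row in the Tabular engine; it is a column-store engine. The model lives compressed in RAM. To evaluate a table, the engine has to materialize the entire table, decompressing it in the process. If you have a field that partitions the table nicely, you can use CALCULATETABLE(<table>, <table>[<field>] = "some literal") to get a subset. Inequalities work too. You could do a few million rows at a time. The RAM limit matters more with 32-bit Excel, which is restricted to a 2 GB working set. Maybe grab a bigger box? – greggyb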
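To make the partitioning idea concrete, here is a minimal sketch that builds one DAX query per partition value. Everything named here is an assumption for illustration: the helper names, the `table1` table, and a `yearmonth` text column holding values such as `2015-12`. It also assumes the table and column names need no extra quoting (no spaces or special characters).

```python
def dax_string_literal(value):
    """Quote a value as a DAX string literal, doubling any embedded double quotes."""
    return '"' + str(value).replace('"', '""') + '"'


def partition_query(table, field, value):
    """Build a query that returns only the rows where table[field] equals value."""
    literal = dax_string_literal(value)
    return f"EVALUATE CALCULATETABLE({table}, {table}[{field}] = {literal})"


def year_months(start, end):
    """Yield 'YYYY-MM' strings from start to end inclusive; both are (year, month) tuples."""
    year, month = start
    while (year, month) <= end:
        yield f"{year:04d}-{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1


def monthly_queries(table, field, start, end):
    """Return (partition_value, query) pairs, one per month in the range."""
    return [(ym, partition_query(table, field, ym)) for ym in year_months(start, end)]
```

Each generated query can then be run on its own (for example, pasted into DAX Studio), so the engine only has to materialize one slice at a time.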

0

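I have found a working (VBA) solution [though greggyb's suggestion also works for me!]: my table was too big to export in one chunk, so I loop and filter on 'month'. This seems to work, and it produces a 1.2 GB CSV once I append all the pieces together: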

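Function YYYYMM(aDate As Date) As Long
    ' Encode a date as a number such as 201512
    YYYYMM = CLng(Year(aDate)) * 100 + Month(aDate)
End Function

Function NextYYYYMM(YYYYMM As Long) As Long
    ' Advance one month, rolling December over to January of the next year
    If YYYYMM Mod 100 = 12 Then
     NextYYYYMM = YYYYMM + 100 - 11
    Else
     NextYYYYMM = YYYYMM + 1
    End If
End Function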

Public Sub CreatePowerPivotDmvInventory() 
    Dim conn As ADODB.Connection 
    Dim tblname As String 
    Dim wbTarget As Workbook 
    On Error GoTo FailureOutput 

    Set wbTarget = ActiveWorkbook 
    wbTarget.Model.Initialize 

    Set conn = wbTarget.Model.DataModelConnection.ModelConnection.ADOConnection 

    ' Call function by passing the DMV name 
    tblname = "table1" 
    WriteDmvContent tblname, conn 

    MsgBox "Finished" 
    Exit Sub 

FailureOutput: 
    MsgBox Err.Description 
End Sub 

Private Sub WriteDmvContent(ByVal dmvName As String, ByRef conn As ADODB.Connection) 
    Dim rs As ADODB.Recordset 
    Dim mdx As String 
    Dim i As Integer 

    'If table small enough: 
    'mdx = "EVALUATE " & dmvName 

    'Other-wise filter: 
    Dim eval_field As String 
    Dim eval_val As Variant 

    'Loop through year_month 
    Dim CurrYM As Long, LimYM As Long 
    Dim String_Date As String 
    CurrYM = YYYYMM(#12/1/2000#) 
    LimYM = YYYYMM(#12/1/2015#) 
    Do While CurrYM <= LimYM 

     String_Date = CStr(Left(CurrYM, 4)) + "-" + CStr(Right(CurrYM, 2)) 
     Debug.Print String_Date 

     eval_field = "yearmonth" 
     eval_val = String_Date 
     mdx = "EVALUATE(CALCULATETABLE(" & dmvName & ", " & dmvName & "[" & eval_field & "] = """ & eval_val & """))" 
     Debug.Print (mdx) 

     Set rs = New ADODB.Recordset 
     rs.ActiveConnection = conn 
     rs.Open mdx, conn, adOpenForwardOnly, adLockOptimistic 

     ' Setup CSV file (improve this code) 
     Dim myFile As String 
     myFile = "H:\vba_tbl_" & dmvName & "_" & eval_val & ".csv" 
     Debug.Print (myFile) 
     Open myFile For Output As #1 

     ' Output column names 
     For i = 0 To rs.Fields.count - 1 
      If i = rs.Fields.count - 1 Then 
       Write #1, """" & rs.Fields(i).Name & """" 
      Else 
       Write #1, """" & rs.Fields(i).Name & """", 
      End If 
     Next i 

     ' Output of the query results 
     Do Until rs.EOF 
      For i = 0 To rs.Fields.count - 1 
       If i = rs.Fields.count - 1 Then 
        Write #1, """" & rs.Fields(i) & """" 
       Else 
        Write #1, """" & rs.Fields(i) & """", 
       End If 
      Next i 
      rs.MoveNext 
     Loop 

     ' Finish this month's file before moving on to the next month
     Close #1
     rs.Close
     Set rs = Nothing

     CurrYM = NextYYYYMM(CurrYM)
    Loop

    Exit Sub 

FailureOutput: 
    MsgBox Err.Description 
End Sub
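The final step, appending the per-month files into one CSV, is not shown above. A minimal sketch of one way to do it follows. It is not the method actually used in this answer, and the function name and file pattern are hypothetical. It assumes every part file starts with the same header line, that no field contains an embedded line break, and that the merged file's name does not match the pattern. Files are copied as raw bytes, so the encoding written by VBA is left untouched.

```python
import glob


def merge_csv_parts(pattern, merged_path):
    """Concatenate the files matching `pattern` into `merged_path`.

    The header line is kept from the first file only.
    Returns the number of data lines written.
    """
    part_paths = sorted(glob.glob(pattern))
    data_lines = 0
    with open(merged_path, "wb") as merged:
        for index, part_path in enumerate(part_paths):
            with open(part_path, "rb") as part:
                header = part.readline()
                if index == 0:
                    merged.write(header)
                # Stream line by line so a multi-gigabyte part never sits in memory
                for line in part:
                    if not line.endswith(b"\n"):
                        line += b"\n"  # keep parts from running together
                    merged.write(line)
                    data_lines += 1
    return data_lines
```

For example, `merge_csv_parts(r"H:\vba_tbl_table1_*.csv", r"H:\table1_merged.csv")` would combine the monthly files in name order, which is also chronological order for `YYYY-MM` suffixes.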