cx_Oracle executemany with CLOB

I am trying to parse several CSV files and insert their data into tables using cx_Oracle. Inserting with execute works fine, but when I attempt the same process with executemany I get an error. My code that works using execute is:

with open(key, 'r') as file:
    for line in file:
        data = line.split(",")  # split the CSV line into column values
        query = "INSERT INTO " + tables[key] + " VALUES ("
        for col in range(len(data)):
            query += ":" + str(col) + ","
        query = query[:-1] + ")"  # drop the trailing comma
        cursor.execute(query, data)

But when I replace it with:

with open(key, 'r') as file:
    rows = []  # renamed from "list" to avoid shadowing the builtin
    for line in file:
        data = line.split(",")
        rows.append(data)
    if len(rows) > 0:
        # assumes every row has the same number of columns as the last one
        query = "INSERT INTO " + tables[key] + " VALUES ("
        for col in range(len(data)):
            query += ":" + str(col) + ","
        query = query[:-1] + ")"
        cursor.prepare(query)
        cursor.executemany(None, rows)

I get "ValueError: string data too large" when trying to insert into a table that has a CLOB column and the data exceeds 4000 bytes. executemany works fine when the table has no CLOB columns. Is there a way to tell cx_Oracle to treat the appropriate columns as CLOBs when it runs executemany?

Answer


Try setting the input size of the large columns to cx_Oracle.CLOB. It may not work if you have binary data, but it should work for any text in your CSV. The 2K cutoff is probably lower than it needs to be.
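For the positional binds used in the question, a minimal sketch might look like this (the connection string, table name, and the assumption that the second column is the CLOB are all hypothetical):

import cx_Oracle

conn = cx_Oracle.connect("user/password@dsn")  # hypothetical credentials
cursor = conn.cursor()

# Rows whose second field can exceed 4000 bytes.
rows = [("id1", "x" * 5000), ("id2", "y" * 6000)]

# setinputsizes takes one argument per bind position: None keeps the
# default type, cx_Oracle.CLOB binds that position as a CLOB.
cursor.setinputsizes(None, cx_Oracle.CLOB)
cursor.prepare("INSERT INTO my_table VALUES (:1, :2)")  # my_table is hypothetical
cursor.executemany(None, rows)
conn.commit()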

Note that executemany seems to be quite a bit slower when CLOB columns are involved, but it is still much faster than repeated calls to execute:

import cx_Oracle

def _executemany(cursor, sql, data):
    '''
    run the parameterized sql with the given dataset using cursor.executemany
    if any column contains string values longer than 2k, use CLOBs to avoid "string
    too large" errors.

    @param sql parameterized sql, with parameters named according to the field names in data
    @param data array of dicts, one per row to execute. each dict must have fields corresponding
       to the parameter names in sql
    '''
    input_sizes = {}
    for row in data:
        for k, v in row.items():
            # basestring is Python 2; use str instead on Python 3
            if isinstance(v, basestring) and len(v) > 2000:
                input_sizes[k] = cx_Oracle.CLOB
    cursor.setinputsizes(**input_sizes)
    cursor.executemany(sql, data)
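
For example, fed rows parsed from a CSV, the helper could be called like this (the connection string, table, and column names are hypothetical):

import cx_Oracle

conn = cx_Oracle.connect("user/password@dsn")  # hypothetical credentials
cursor = conn.cursor()

# Named bind parameters in the sql must match the dict keys in data.
sql = "INSERT INTO my_table (id, body) VALUES (:id, :body)"
data = [
    {"id": "row1", "body": "short text"},
    {"id": "row2", "body": "x" * 5000},  # > 2000 chars, takes the CLOB path
]
_executemany(cursor, sql, data)
conn.commit()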