cx_Oracle executemany with a CLOB_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2023-03-18 18:43 出处：网络

I am attempting to parse multiple CSVs and insert their data into tables using cx_Oracle. I have no problem inserting into the tables using execute but when I try the same procedure with executemany I

with open(key,'r') as file:
    for line in file:
        data = data.split(",")
        query = "INSERT INTO " + tables[key] + " VALUES ("
        for col in range(len(data)):
            query += ":" + str(col) + ","
        query = query[:-1] + ")"            
        cursor.execute(query, data)

but when I replace it with

with open(key,'r') as file:
    list = []
    for line in file:
        data = data.split(",")
        list.append(data)
    if len(l开发者_如何学Cist) > 0:
        query = "INSERT INTO " + tables[key] + " VALUES ("
        for col in range(len(data)):
            query += ":" + str(col) + ","
        query = query[:-1] + ")"            
        cursor.prepare(query)
        cursor.executemany(None,list)

I get "ValueError: string data too large" when trying to insert into a table that has CLOB columns and the data is over 4000 bytes. Executemany works great when the table doesn't have a CLOB column. Is there a way I can tell cx_Oracle to treat the appropriate columns as CLOBs when it does executemany?

Try setting the input size for the large columns to cx_Oracle.CLOB. Might not work if you have binary data, but should work for any text you have in a CSV. The 2K value is probably lower than it needs to be.

Note that executemany seems to be a lot slower when there are CLOB columns involved, but still better than repeated executes:

def _executemany(cursor, sql, data):
    '''
    run the parameterized sql with the given dataset using cursor.executemany 
    if any column contains string values longer than 2k, use CLOBS to avoid "string
    too large" errors.

    @param sql parameterized sql, with parameters named according to the field names in data
    @param data array of dicts, one per row to execute.  each dict must have fields corresponding
                to the parameter names in sql
    '''
    input_sizes = {}
    for row in data:
        for k, v in row.items():
            if isinstance(v, basestring) and len(v) > 2000:
                input_sizes[k] = cx_Oracle.CLOB
    cursor.setinputsizes(**input_sizes)
    cursor.executemany(sql, data)