Is it possible to continue to mirror a database to the same destination if the connection breaks? #18

@kevjp

Hi, I am trying to mirror the gbCdnaInfo table, which is pretty large (~40 GB), with:

import cruzdb

g = cruzdb.Genome(db="hg19")

gbCdnaInfo = g.mirror(['gbCdnaInfo'], 'sqlite:////home/test/gbCdnaInfo160104.db')

When I ran the code, it mirrored 19 GB before the connection went down. When I restarted the script above, I got this error:

attempting to add to existing sqlite database
Mirroring gbCdnaInfo
Traceback (most recent call last):
  File "mirrordatabase160104.py", line 11, in <module>
    gbCdnaInfo = g.mirror(['gbCdnaInfo'], 'sqlite:////home/test/gbCdnaInfo160104.db')
  File "/usr/local/lib/python2.7/dist-packages/cruzdb/__init__.py", line 97, in mirror
    return mirror(self, tables, dest_url)
  File "/usr/local/lib/python2.7/dist-packages/cruzdb/mirror.py", line 110, in mirror
    destination.execute(ins, records)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 991, in execute
    bind, close_with_result=True).execute(clause, params or {})
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 729, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/sql/elements.py", line 321, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 826, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 958, in _execute_context
    context)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1160, in _handle_dbapi_exception
    exc_info
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/util/compat.py", line 199, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 928, in _execute_context
    context)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.7-py2.7-linux-x86_64.egg/sqlalchemy/engine/default.py", line 433, in do_executemany
    cursor.executemany(statement, parameters)
sqlalchemy.exc.IntegrityError: (IntegrityError) UNIQUE constraint failed: gbCdnaInfo.id u'INSERT INTO "gbCdnaInfo" (id, acc, version, moddate, type, direction, source, organism, library, "mrnaClone", sex, tissue, development, cell, cds, keyword, description, "geneName", "productName", author, gi, mol) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)' ((1L, 'AB004856', 1, '2008-11-23', 'mRNA', '0', 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 3036932L, 'mRNA'), (2L, 'AB005263', 1, '2008-11-23', 'mRNA', '0', 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 2L, 2L, 1L, 3036936L, 'mRNA'), (3L, 'AB011407', 1, '2008-11-03', 'mRNA', '0', 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 3L, 1L, 0L, 0L, 3L, 1L, 4190955L, 'mRNA'), (4L, 'AB012144', 1, '1999-01-24', 'mRNA', '0', 2L, 2L, 0L, 1L, 0L, 0L, 0L, 0L, 4L, 2L, 0L, 3L, 0L, 2L, 3882098L, 'mRNA'), (5L, 'AB012145', 1, '1999-04-01', 'mRNA', '0', 3L, 3L, 0L, 2L, 0L, 0L, 0L, 0L, 5L, 3L, 0L, 0L, 0L, 3L, 4730806L, 'mRNA'), (6L, 'AB017109', 1, '2006-11-27', 'mRNA', '0', 4L, 4L, 0L, 0L, 0L, 0L, 0L, 0L, 6L, 1L, 0L, 6L, 4L, 4L, 4239966L, 'mRNA'), (7L, 'AB019621', 1, '1999-07-01', 'mRNA', '0', 5L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 7L, 4L, 0L, 0L, 5L, 5L, 4586513L, 'mRNA'), (8L, 'AB026157', 2, '1999-08-01', 'mRNA', '0', 6L, 6L, 0L, 3L, 0L, 0L, 0L, 0L, 8L, 5L, 0L, 0L, 6L, 6L, 5811598L, 'mRNA')  ... displaying 10 of 20001 total bound parameter sets ...  (20000L, 'AB229080', 1, '2007-05-15', 'mRNA', '0', 449L, 445L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 178L, 0L, 0L, 0L, 451L, 84576905L, 'mRNA'), (20001L, 'AB229081', 1, '2007-05-15', 'mRNA', '0', 449L, 445L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 178L, 0L, 0L, 0L, 451L, 84576906L, 'mRNA'))

This suggests that the mirror function is trying to pick up where it left off, although the failing INSERT begins again at id 1. Do you know what might be the problem here? The second run fails with a UNIQUE constraint error, but the original failure was:

sqlalchemy.exc.OperationalError: (OperationalError) (2013, 'Lost connection to MySQL server during query') 'SELECT `gbCdnaInfo`.id, `gbCdnaInfo`.acc, `gbCdnaInfo`.version, `gbCdnaInfo`.moddate, `gbCdnaInfo`.type, `gbCdnaInfo`.direction, `gbCdnaInfo`.source, `gbCdnaInfo`.organism, `gbCdnaInfo`.library, `gbCdnaInfo`.`mrnaClone`, `gbCdnaInfo`.sex, `gbCdnaInfo`.tissue, `gbCdnaInfo`.development, `gbCdnaInfo`.cell, `gbCdnaInfo`.cds, `gbCdnaInfo`.keyword, `gbCdnaInfo`.description, `gbCdnaInfo`.`geneName`, `gbCdnaInfo`.`productName`, `gbCdnaInfo`.author, `gbCdnaInfo`.gi, `gbCdnaInfo`.mol \nFROM `gbCdnaInfo` \n LIMIT %s, %s' (48032000, 8000)
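To illustrate the behaviour I am after, here is a minimal sketch of a resumable copy. It is not cruzdb's code: `resume_copy` and its parameters are hypothetical names, and the standard-library `sqlite3` module stands in for both the MySQL source and the SQLite destination. It assumes the key column (`id` here, which the UNIQUE constraint above implies is unique) is an increasing integer, so the copy can restart after the highest id already in the destination instead of re-inserting from row 1.

```python
import sqlite3


def resume_copy(src, dest, table, key="id", batch_size=8000):
    """Copy rows from src to dest, skipping rows dest already holds.

    Hypothetical helper: assumes `key` is a unique, increasing integer
    column and that `table` already exists in dest with the same columns.
    Returns the number of rows copied by this call.
    """
    # Highest key already present in the destination (0 if it is empty).
    last = dest.execute(
        "SELECT COALESCE(MAX({k}), 0) FROM {t}".format(k=key, t=table)
    ).fetchone()[0]
    copied = 0
    while True:
        # Seek past the last copied key instead of using LIMIT offset, count.
        cur = src.execute(
            "SELECT * FROM {t} WHERE {k} > ? ORDER BY {k} LIMIT ?".format(
                k=key, t=table),
            (last, batch_size),
        )
        columns = [d[0] for d in cur.description]
        rows = cur.fetchall()
        if not rows:
            return copied
        placeholders = ", ".join("?" * len(columns))
        with dest:  # commit each batch so an interruption loses at most one
            dest.executemany(
                "INSERT INTO {t} VALUES ({p})".format(t=table, p=placeholders),
                rows,
            )
        last = rows[-1][columns.index(key)]
        copied += len(rows)


# Demo with two in-memory databases standing in for source and destination.
src = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
for db in (src, dest):
    db.execute("CREATE TABLE gbCdnaInfo (id INTEGER PRIMARY KEY, acc TEXT)")
src.executemany("INSERT INTO gbCdnaInfo VALUES (?, ?)",
                [(i, "ACC%d" % i) for i in range(1, 11)])
# Simulate an interrupted first run: only ids 1-4 reached the destination.
dest.executemany("INSERT INTO gbCdnaInfo VALUES (?, ?)",
                 [(i, "ACC%d" % i) for i in range(1, 5)])
dest.commit()

copied = resume_copy(src, dest, "gbCdnaInfo", batch_size=3)
total = dest.execute("SELECT COUNT(*) FROM gbCdnaInfo").fetchone()[0]
```

In the demo, the second run copies only the six missing rows and never hits the UNIQUE constraint. Whether something like this fits into mirror() depends on every mirrored table having a suitable key column, which I have not checked.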
