Opened 14 years ago

Closed 14 years ago

#744 closed enhancement (fixed)

Make it possible to resume the migration program in case of a failure

Reported by: Nicklas Nordborg Owned by: Nicklas Nordborg
Priority: critical Milestone: BASE 2.4.1
Component: migrate Version:
Keywords: Cc:

Description

The migration program can take a very long time to run if there is a lot of data on the BASE 1 server. For the installation here in Lund, the expected run time is 100 hours or more. This makes the migration sensitive to random failures. In fact, we have not yet been able to complete a migration successfully. The best attempt had about one hour left when it failed with:

net.sf.basedb.core.BaseException: could not execute query using scroll
        at net.sf.basedb.core.HibernateUtil.loadIterator(HibernateUtil.java:1481)
        at net.sf.basedb.core.RawDataBatcher.<init>(RawDataBatcher.java:192)
        at net.sf.basedb.core.RawDataBatcher.getNew(RawDataBatcher.java:98)
        at net.sf.basedb.core.RawBioAssay.getRawDataBatcher(RawBioAssay.java:644)
        at net.sf.basedb.clients.migrate.RawBioAssayDataTransfer.transferRawBioAssayData(RawBioAssayDataTransfer.java:164)
        at net.sf.basedb.clients.migrate.RawBioAssayDataTransfer.createItem(RawBioAssayDataTransfer.java:133)
        at net.sf.basedb.clients.migrate.Transfer.runUnBatched(Transfer.java:413)
        at net.sf.basedb.clients.migrate.RawBioAssayDataTransfer.start(RawBioAssayDataTransfer.java:104)
        at net.sf.basedb.clients.migrate.Migrater.startTransfer(Migrater.java:231)
        at net.sf.basedb.clients.migrate.Migrater.run(Migrater.java:191)
        at net.sf.basedb.clients.migrate.Migrater.main(Migrater.java:493)
Caused by: org.hibernate.exception.JDBCConnectionException: could not execute query using scroll
        at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:74)
        at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
        at org.hibernate.loader.Loader.scroll(Loader.java:2328)
        at org.hibernate.loader.hql.QueryLoader.scroll(QueryLoader.java:441)
        at org.hibernate.hql.ast.QueryTranslatorImpl.scroll(QueryTranslatorImpl.java:390)
        at org.hibernate.engine.query.HQLQueryPlan.performScroll(HQLQueryPlan.java:245)
        at org.hibernate.impl.StatelessSessionImpl.scroll(StatelessSessionImpl.java:586)
        at org.hibernate.impl.QueryImpl.scroll(QueryImpl.java:67)
        at net.sf.basedb.core.HibernateUtil.loadIterator(HibernateUtil.java:1476)
        ... 10 more
Caused by: com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **

java.io.EOFException

STACKTRACE:

java.io.EOFException
        at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:1956)
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2421)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2867)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:870)
        at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1345)
        at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:2326)
        at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:436)
        at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2033)
        at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1436)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1770)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:3255)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1293)
        at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1428)
        at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:76)
        at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:186)
        at org.hibernate.loader.Loader.getResultSet(Loader.java:1787)
        at org.hibernate.loader.Loader.scroll(Loader.java:2293)
        at org.hibernate.loader.hql.QueryLoader.scroll(QueryLoader.java:441)
        at org.hibernate.hql.ast.QueryTranslatorImpl.scroll(QueryTranslatorImpl.java:390)
        at org.hibernate.engine.query.HQLQueryPlan.performScroll(HQLQueryPlan.java:245)
        at org.hibernate.impl.StatelessSessionImpl.scroll(StatelessSessionImpl.java:586)
        at org.hibernate.impl.QueryImpl.scroll(QueryImpl.java:67)
        at net.sf.basedb.core.HibernateUtil.loadIterator(HibernateUtil.java:1476)
        at net.sf.basedb.core.RawDataBatcher.<init>(RawDataBatcher.java:192)
        at net.sf.basedb.core.RawDataBatcher.getNew(RawDataBatcher.java:98)
        at net.sf.basedb.core.RawBioAssay.getRawDataBatcher(RawBioAssay.java:644)
        at net.sf.basedb.clients.migrate.RawBioAssayDataTransfer.transferRawBioAssayData(RawBioAssayDataTransfer.java:164)
        at net.sf.basedb.clients.migrate.RawBioAssayDataTransfer.createItem(RawBioAssayDataTransfer.java:133)
        at net.sf.basedb.clients.migrate.Transfer.runUnBatched(Transfer.java:413)
        at net.sf.basedb.clients.migrate.RawBioAssayDataTransfer.start(RawBioAssayDataTransfer.java:104)
        at net.sf.basedb.clients.migrate.Migrater.startTransfer(Migrater.java:231)
        at net.sf.basedb.clients.migrate.Migrater.run(Migrater.java:191)
        at net.sf.basedb.clients.migrate.Migrater.main(Migrater.java:493)


** END NESTED EXCEPTION ** 
Last packet sent to the server was 173460 ms ago.
        at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2579)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2867)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:870)
        at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1345)
        at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:2326)
        at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:436)
        at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2033)
        at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1436)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1770)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:3255)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1293)
        at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1428)
        at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:76)
        at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:186)
        at org.hibernate.loader.Loader.getResultSet(Loader.java:1787)
        at org.hibernate.loader.Loader.scroll(Loader.java:2293)
        ... 16 more

The error happens randomly and seems to be caused by some kind of unstable communication with MySQL.

The best way to solve this is to make it possible to resume a failed migration from the point where it failed. This doesn't have to be a generic solution that can resume the migration from anywhere. Most of the time (90% or more) is spent migrating raw data. This is almost the last step, and I think it is the only place where we need to be able to resume. Fortunately, it is also one of the easiest places to implement this. We only need to keep track of the mappings between BASE 1 and BASE 2 IDs for a few item types (raw bioassays, reporters, users, groups, and maybe some more).
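
As a rough illustration of the ID-mapping idea, here is a minimal sketch that is not taken from the migration code. The class name IdMappingStore, its methods, the item type string "rawbioassay", and the choice of a properties file as storage are all assumptions made for this example. Each mapping is written to disk as soon as an item has been transferred, so a restarted run can load the file and skip everything that is already done.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

/**
 * Hypothetical checkpoint store (not part of BASE): remembers which
 * BASE 1 items have been migrated and which BASE 2 ID each one received.
 */
final class IdMappingStore {
    private final Path file;
    private final Properties mappings = new Properties();

    /** Loads earlier mappings if the file exists, otherwise starts empty. */
    public IdMappingStore(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                mappings.load(in);
            }
        }
    }

    // Keys look like "rawbioassay.17": item type plus BASE 1 ID.
    private static String key(String itemType, int base1Id) {
        return itemType + "." + base1Id;
    }

    /** True if this BASE 1 item was transferred in this or an earlier run. */
    public boolean isMigrated(String itemType, int base1Id) {
        return mappings.containsKey(key(itemType, base1Id));
    }

    /** Returns the BASE 2 ID for a BASE 1 item, or null if not yet migrated. */
    public Integer getBase2Id(String itemType, int base1Id) {
        String value = mappings.getProperty(key(itemType, base1Id));
        return value == null ? null : Integer.valueOf(value);
    }

    /** Records one finished item and saves the whole mapping to disk. */
    public void record(String itemType, int base1Id, int base2Id) throws IOException {
        mappings.setProperty(key(itemType, base1Id), Integer.toString(base2Id));
        // Write to a temporary file first, then replace the real file, so a
        // crash during the write does not leave a half-written mapping file.
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            mappings.store(out, "BASE 1 to BASE 2 ID mappings");
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With something like this, the raw data transfer loop would call isMigrated("rawbioassay", id) before transferring each raw bioassay and record(...) after each one is committed. Rewriting the whole file on every call is simple but only reasonable because the number of tracked items is small compared to the amount of raw data.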

Change History (4)

comment:1 Changed 14 years ago by Nicklas Nordborg

Owner: changed from everyone to Nicklas Nordborg
Status: changed from new to assigned

comment:2 Changed 14 years ago by Nicklas Nordborg

(In [3709]) References #744: Make it possible to resume the migration program in case of a failure

It should now be possible to resume the migration if there is a failure during raw data transfer.

comment:3 Changed 14 years ago by Jari Häkkinen

(In [3713]) Addresses #744. Added information about migration restart to documentation.

comment:4 Changed 14 years ago by Jari Häkkinen

Resolution: fixed
Status: changed from assigned to closed