
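Version: 1.4.0

Environment: replica set architecture

Symptom: the primary responded extremely slowly, and we could not even log in with the mongo shell. The secondary's log showed:

```javascript
Tue Mar 11 21:03:45.379 [rsBackgroundSync] Socket recv() timeout ip:port
Tue Mar 11 21:03:45.379 [rsBackgroundSync] SocketException: remote: ip:port error: 9001 socket exception [RECV_TIMEOUT] server [ip:port]
```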

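What we did:

We restarted the primary, and the former secondary was automatically promoted to be the new primary. During the restart we found that a rollback left the node unable to sync, and it went into the FATAL state.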
```javascript
Tue Mar 11 21:34:31.632 [rsBackgroundSync] replSet syncing to: ip:port
Tue Mar 11 21:34:31.642 [rsBackgroundSync] Rollback needed! Our GTIDprimary: 2 secondary: 47478127 remote GTID: primary: 3 secondary: 0. Attempting rollback.
Tue Mar 11 21:34:31.698 [rsBackgroundSync] Rollback takes us too far back, throwing exception. remoteTS: 1394542995394 oplogTS: 1394544846610
Tue Mar 11 21:34:31.698 [rsBackgroundSync] Caught DBException during rollback 0 Failed to rollback oplog operation: replSet rollback too long a time period for a rollback (at least 30 minutes).
Tue Mar 11 21:34:31.698 [rsBackgroundSync] Caught a RollbackOplogException during rollback, going fatal
Tue Mar 11 21:34:31.698 [rsBackgroundSync] replSet error fatal, stopping replication
```

There was no good way to recover from this. We rebuilt the node and resynced it from scratch.

3 replies
#1 ccj • 2014-03-12 18:41

The rollback failed because the time gap in the oplog exceeded half an hour (1800 s).
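To check this against the log in the original post, here is a minimal Python sketch that pulls `remoteTS` and `oplogTS` out of the "Rollback takes us too far back" line and compares their difference with the 1800 s limit. It assumes both values are milliseconds since the Unix epoch and that the server compares `oplogTS - remoteTS` against that limit. The helper name `rollback_gap_ms` and the constant `ROLLBACK_LIMIT_MS` are made up for illustration and are not part of the server.

```python
import re

# Assumption: the limit is 30 minutes (1800 s), as stated in the reply above,
# and the timestamps in the log are milliseconds since the Unix epoch.
ROLLBACK_LIMIT_MS = 30 * 60 * 1000

# Log line copied from the original post.
LOG_LINE = (
    "Tue Mar 11 21:34:31.698 [rsBackgroundSync] Rollback takes us too far back, "
    "throwing exception. remoteTS: 1394542995394 oplogTS: 1394544846610"
)


def rollback_gap_ms(line):
    """Return oplogTS - remoteTS (in ms) parsed from a rollback log line."""
    match = re.search(r"remoteTS: (\d+) oplogTS: (\d+)", line)
    if match is None:
        raise ValueError("no remoteTS/oplogTS pair found in log line")
    remote_ts = int(match.group(1))
    oplog_ts = int(match.group(2))
    return oplog_ts - remote_ts


gap_ms = rollback_gap_ms(LOG_LINE)
print("gap: %.3f s, over limit: %s" % (gap_ms / 1000.0, gap_ms > ROLLBACK_LIMIT_MS))
```

Under those assumptions the gap is 1851.216 s, roughly 51 seconds past the limit, which matches the "at least 30 minutes" message in the log.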

#2 Hisoka-J • 2014-03-25 12:13

What caused this replication failure? Did it just break for no apparent reason?

#3 ccj • 2014-03-25 12:58

@Hisoka-J Yes. We never found the root cause.
