The mongooplogsync tool has been upgraded to mongosync, which supports both full and incremental synchronization.
http://nosqldb.org/p/5173d275cbce24580a033bd8

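MongoDB replica sets and shards synchronize data only among their own members; there is no built-in way to synchronize data between different replica sets or between different shards.
When migrating a MongoDB replica set, existing methods cannot keep two replica sets in sync without stopping the application. Following the way MongoDB replication works internally, the data can be synchronized by parsing the oplog.
The steps are:
1. Read the oplog of the source replica set.
2. Replay the oplog on the target replica set.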
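The two steps above can be illustrated with a small Python sketch. It is not mongooplogsync's implementation: the target is a plain in-memory dict (an assumption made so the example runs without a server), and only the field names op, ns, o and o2 follow the layout of real oplog entries. The sample entries are made up for illustration.

```python
from typing import Any, Dict, Iterable

# In-memory stand-in for a target database: {namespace: {_id: document}}.
Store = Dict[str, Dict[Any, Dict[str, Any]]]


def replay(entries: Iterable[Dict[str, Any]], target: Store) -> int:
    """Apply oplog-style entries to `target` in order; return how many were applied."""
    applied = 0
    for entry in entries:
        op = entry["op"]
        if op not in ("i", "u", "d"):
            continue  # skip no-ops ("n"), commands ("c") and anything else
        coll = target.setdefault(entry["ns"], {})
        o = entry["o"]
        if op == "i":
            # insert: "o" is the full document
            coll[o["_id"]] = dict(o)
        elif op == "u":
            # update: "o2" identifies the document, "o" describes the change
            _id = entry["o2"]["_id"]
            if "$set" in o:
                coll.setdefault(_id, {"_id": _id}).update(o["$set"])
            else:
                coll[_id] = dict(o, _id=_id)  # full-document replacement
        else:
            # delete: "o" carries the _id of the document to remove
            coll.pop(o["_id"], None)
        applied += 1
    return applied


# Hypothetical entries, as if read from the source oplog in timestamp order.
source_oplog = [
    {"ts": (1364868866, 1), "op": "i", "ns": "test.t", "o": {"_id": 1, "x": 1}},
    {"ts": (1364868866, 2), "op": "i", "ns": "test.t", "o": {"_id": 2, "x": 5}},
    {"ts": (1364868867, 1), "op": "u", "ns": "test.t",
     "o2": {"_id": 1}, "o": {"$set": {"x": 2}}},
    {"ts": (1364868868, 1), "op": "d", "ns": "test.t", "o": {"_id": 2}},
    {"ts": (1364868869, 1), "op": "n", "ns": "", "o": {"msg": "periodic noop"}},
]

target: Store = {}
applied = replay(source_oplog, target)
print(applied, target)
```

Replaying in timestamp order matters: the update and the delete only make sense after the inserts they refer to.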
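Since MongoDB 2.2 there has been a mongooplog tool, but it is awkward to use: the host given with --from cannot be authenticated against, so it cannot be used to synchronize two replica sets that both require authentication.
Official documentation: http://docs.mongodb.org/manual/reference/mongooplog/
Similar tools
A Python tool similar to mongooplog: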
https://pypi.python.org/pypi/mongooplog-alt/0.4.1
GitHub: https://github.com/publishthis/mongooplog-alt
There is another one here:
https://github.com/RedBeard0531/mongo-oplog-watcher

mongooplogsync features and usage
Why is mongooplogsync needed?
Because the official tool is limited in functionality and not easy to use.
Download: http://pan.baidu.com/share/link?shareid=412035&uk=3121017937

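Typical mongooplogsync use cases
1. Live migration, especially from one cluster to another, or from a master-slave architecture to a replica set architecture.
2. Real-time synchronization, for example backing up to another cluster, or many-to-one backup.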

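mongooplogsync features and enhancements:
1. Very easy to use.
2. Behaves like tail -f: data written during the sync and afterwards is also synchronized to the target database.
3. Supports synchronizing MongoDB 1.8.x, 2.0.x, and 2.4.x, and supports real-time synchronization from master-slave to replica sets (very useful for that kind of architecture migration).
4. Supports many-to-one synchronization with the -n option: a regular expression matches several collections in the source database and they are synchronized into a single collection in the target database, for example source collections suffixed by year and month into one target collection.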

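Migration procedure:
1. Copy the source MongoDB data files directly, or use mongodump/mongorestore, to load the data into the target MongoDB, and note the source MongoDB's optime timestamp before copying (check optime in rs.status()).
2. Use mongooplogsync to synchronize the data added during step 1 and during step 2 itself.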
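The optime noted in step 1 is what later goes into the -s option, written as two comma-separated numbers. The sketch below is only a sanity check for such a value before starting the sync. It assumes the first number is milliseconds since the Unix epoch and the second is the increment within that second; that reading fits the values used in this post but is an assumption, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_optime_arg(value: str) -> Tuple[int, int]:
    """Split '<milliseconds>,<increment>' into two integers."""
    millis, increment = value.split(",")
    return int(millis), int(increment)


def optime_to_datetime(millis: int) -> datetime:
    """Interpret the first component as milliseconds since the epoch (assumed)."""
    return EPOCH + timedelta(milliseconds=millis)


millis, increment = parse_optime_arg("1365819288000,17472")
print(optime_to_datetime(millis).isoformat(), increment)
# prints: 2013-04-13T02:14:48+00:00 17472
```

If the printed time is not shortly before the copy in step 1 started, the recorded optime is probably wrong and the sync would miss or re-apply operations.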

Supported platforms:
x86-64 CentOS 6.2; CentOS 5.4 requires upgrading glibc and related libraries.
mongooplogsync usage:

[root@~]# /root/ccj/bin/mongooplogsync -h 172.17.4.91:27017 -u ccj -p 123 --to 172.17.4.91:27019  --tu ccj --tp 123 -d test -c t -s 1365819288000,17472
connected to: 172.17.4.91:27017
Sat Apr 13 18:53:41 source db connected ok
Sat Apr 13 18:53:41 target db connected ok
Sat Apr 13 18:53:41 begin to sync data...
Sat Apr 13 18:53:42 test.t:200000 rows data to sync...
Sat Apr 13 18:53:44 test.t:20000 rows data are synced.
Sat Apr 13 18:53:46 test.t:40000 rows data are synced.
Sat Apr 13 18:53:48 test.t:60000 rows data are synced.
Sat Apr 13 18:53:50 test.t:80000 rows data are synced.
Sat Apr 13 18:53:52 test.t:100000 rows data are synced.
Sat Apr 13 18:53:53 test.t:120000 rows data are synced.
Sat Apr 13 18:53:55 test.t:140000 rows data are synced.
Sat Apr 13 18:53:57 test.t:160000 rows data are synced.
Sat Apr 13 18:53:59 test.t:180000 rows data are synced.
Sat Apr 13 18:54:01 test.t:200000 rows data are synced.
Sat Apr 13 18:54:06 test.t:200000 rows data are synced,waiting for new data...^_^
Sat Apr 13 18:54:11 test.t:200000 rows data are synced,waiting for new data...^_^

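Here -h (or --host) is the source database address. It is more convenient to append the port directly after -h, but the port can also be given separately with --port.
Many-to-one synchronization with regex matching:
Synchronize task_0, task_1, and task_3 in the source database into the task collection in the target database.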

[root@~]# /root/ccj/bin/mongooplogsync -h 172.17.4.91:27017 -u ccj -p 123 --to 172.17.4.91:27019  --tu ccj --tp 123 -n test.task*,test.task -s 1365912122000,1
connected to: 172.17.4.91:27017
Sun Apr 14 12:29:30 source db connected ok
Sun Apr 14 12:29:30 target db connected ok
Sun Apr 14 12:29:30 begin to sync data...
Sun Apr 14 12:29:31 test.task:200003 rows data to sync...
Sun Apr 14 12:29:34 test.task:20000 rows data are synced.
Sun Apr 14 12:29:36 test.task:40000 rows data are synced.
Sun Apr 14 12:29:37 test.task:60000 rows data are synced.
Sun Apr 14 12:29:39 test.task:80000 rows data are synced.
Sun Apr 14 12:29:41 test.task:100000 rows data are synced.
Sun Apr 14 12:29:43 test.task:120000 rows data are synced.
Sun Apr 14 12:29:45 test.task:140000 rows data are synced.
Sun Apr 14 12:29:46 test.task:160000 rows data are synced.
Sun Apr 14 12:29:48 test.task:180000 rows data are synced.
Sun Apr 14 12:29:50 test.task:200000 rows data are synced.
Sun Apr 14 12:29:55 test.task:200003 rows data are synced,waiting for new data...^_^
Sun Apr 14 12:30:00 test.task:200003 rows data are synced,waiting for new data...^_^
Sun Apr 14 12:30:05 test.task:200003 rows data are synced,waiting for new data...^_^
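How a -n pair such as test.task*,test.task could route source namespaces to one target namespace is sketched below. This is a guess at the behaviour, not the tool's code: it assumes the part before the comma is a regular expression matched from the start of the source namespace, and the function name is hypothetical.

```python
import re
from typing import Optional


def map_namespace(ns_pair: str, source_ns: str) -> Optional[str]:
    """Return the target namespace if source_ns matches the pair's regex, else None."""
    pattern, target_ns = ns_pair.split(",", 1)
    # re.match anchors at the start only, so "test.task*" also accepts longer
    # names such as "test.task_0". As a regex, "task*" means "tas" followed by
    # zero or more "k", and "." matches any single character.
    return target_ns if re.match(pattern, source_ns) else None


pair = "test.task*,test.task"
for ns in ("test.task_0", "test.task_1", "test.task_3", "test.t", "test.other"):
    print(ns, "->", map_namespace(pair, ns))
```

Under these assumptions the three task_N collections map to test.task, while test.t and test.other are skipped. A stricter pattern such as test\.task_\d+ would avoid accidental matches.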

Viewing the help options:
[root@week1 mongodb-src-r2.2.2]# ./mongooplogsync -help
Pull oplog from source MongoDB server and replay oplog at target server.
An Improved alternative to official mongooplog utility,enjoy it!
Author:ccj,for more infomation,please visit http://nosql-db.com/topic/5165439ef7e53fe4530b88b0

options:
  --help                   produce help message
  -v [ --verbose ]         be more verbose (include multiple times for more
                           verbosity e.g. -vvvvv)
  --version                print the program's version and exit
  -h [ --host ] arg        mongo host to connect to ( <set name>/s1,s2 for
                           sets)
  --port arg               server port. Can also use --host hostname:port
  --ipv6                   enable IPv6 support (disabled by default)
  -u [ --username ] arg    username
  -p [ --password ] arg    password
  --dbpath arg             directly access mongod database files in the given
                           path, instead of connecting to a mongod server -
                           needs to lock the data directory, so cannot be used
                           if a mongod is currently accessing the same path
  --directoryperdb         if dbpath specified, each db is in a separate
                           directory
  --journal                enable journaling
  -d [ --db ] arg          database to use
  -c [ --collection ] arg  collection to use (some commands)
  -f [ --fields ] arg      comma separated list of field names e.g. -f name,age
  --fieldFile arg          file with fields names - 1 per line
  --to arg                 target host:port
  --tu arg                 target username
  --tp arg                 target password
  -n [ --nsregex ] arg     comma separated pair of source ns regex and target
                           ns e.g. -n test.t*,test.t
  -s [ --optimestart ] arg optime start timestamp value from rs.status(),such
                           as -s 1365819288000,17472

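======================== Historical versions ======================================
A demo program that needs further work. It looks like this:
Synchronize the data of the source replica set's test.t collection written after optime 1364868866000 1000 to the target replica set.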

demo 0.1:

[root@~]# ./mongosync 172.17.4.91::27017 ccj 123 172.17.4.91::27019 ccj 123 test t 1364868866000 1000
    source db connected ok
    target db connected ok
    syncing data......
    100000 rows data are synced

demo 0.2:

[root@association11 tmp]#  ./mongooplogsync  172.17.4.91::27017 ccj 123 172.17.4.91::27019 ccj 123 test t 1364868866000 1
source db connected ok
target db connected ok
begin to sync data......
test.t:1000002 rows data to sync...
test.t:100000 rows data are synced.
test.t:200000 rows data are synced.
test.t:300000 rows data are synced.
test.t:400000 rows data are synced.
test.t:500000 rows data are synced.
test.t:600000 rows data are synced.
test.t:700000 rows data are synced.
test.t:800000 rows data are synced.
test.t:900000 rows data are synced.
test.t:1000000 rows data are synced.
test.t:1000002 rows data are synced,waiting for new data#^_^
test.t:1000002 rows data are synced,waiting for new data#^_^
test.t:1000002 rows data are synced,waiting for new data#^_^

Edited by ccj on 2015-10-05 20:45