
I am using a script to insert 2 million rows into a database.

Specifically: I read the Alarm collection from database A, 1000 documents per query, convert the format, and insert the results into another collection, CommonAlarm, in database B. The main part of the script:

print("Starting data migration…");
initCountry();
var startTime = new Date().getTime();
var total = dmDb.Alarm.count();
var segmentSize = 1000;
var segment = parseInt(total / segmentSize);
var currentFinish = 0;
for (var i = 0; i < segment; i++) {
    var alarmList = dmDb.Alarm.find().skip(i * segmentSize).limit(segmentSize);
    var commonAlarmList = [];
    alarmList.forEach(function (alarm) {
        commonAlarmList.push(buildCommonAlarm(alarm));
    });
    alarmDb.CommonAlarm2.insertMany(commonAlarmList);
    currentFinish += segmentSize;
    print("Progress: " + (currentFinish / total * 100) + " %");
}
// Handle the final partial batch.
if (total - currentFinish > 0) {
    var commonAlarmList = [];
    var alarmList = dmDb.Alarm.find().skip(currentFinish);
    alarmList.forEach(function (alarm) {
        commonAlarmList.push(buildCommonAlarm(alarm));
    });
    alarmDb.CommonAlarm2.insertMany(commonAlarmList);
}

var endTime = new Date().getTime();

print("Total time: " + (endTime - startTime) / 1000);

The problem is that after a bit over 1 million rows have been inserted, the console running the script reports this error:

[thread1] Error: error doing query: failed: network error while attempting to run 'find'

The connection counts while the script was running were:

[root@dm-allinone-de-aws bin]# netstat -anpt | grep 27017|wc -l
32

> db.serverStatus().connections;
{ "current" : 22, "available" : 819178, "totalCreated" : 58 }

The mongodb log looks like this:

At first this query took only about 300 ms; as the data volume grows, the time gets longer and longer:

2020-07-10T09:34:52.558+0800 I COMMAND [conn26] command cloud_dm.Alarm appName: "MongoDB Shell" command: find { find: "Alarm", filter: {}, skip: 1542000.0, limit: 1000.0, singleBatch: false } planSummary: COLLSCAN cursorid:96326131190 keysExamined:0 docsExamined:1541601 numYields:12051 nreturned:101 reslen:94337 locks:{ Global: { acquireCount: { r: 24104 } }, Database: { acquireCount: { r: 12052 } }, Collection: { acquireCount: { r: 12052 } } } protocol:op_command 1927ms
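One thing that log line does show: `planSummary: COLLSCAN` with `docsExamined:1541601` for only `nreturned:101` means every `skip()` query walks past all of the skipped documents again. A minimal pure-JavaScript sketch of the difference between offset paging and `_id`-range paging (in-memory arrays standing in for the collection; all names here are hypothetical, not driver APIs):

```javascript
// In-memory stand-in for a collection already in _id order
// (hypothetical data; real documents would carry more fields).
const docs = Array.from({ length: 10000 }, function (_, i) { return { _id: i }; });

// Offset paging, like find().skip(n).limit(k): the scan has to walk
// past all n skipped documents before it can return k results.
function pageBySkip(skip, limit) {
  let examined = 0;
  const out = [];
  for (const d of docs) {
    examined++;                       // every skipped doc is still touched
    if (examined > skip) out.push(d);
    if (out.length === limit) break;
  }
  return { out: out, examined: examined };
}

// Range paging, like find({ _id: { $gt: lastId } }).limit(k) against the
// _id index: one seek to lastId, then only k documents are read.
function pageByRange(lastId, limit) {
  // Binary search for the first _id greater than lastId (the index seek).
  let lo = 0, hi = docs.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (docs[mid]._id > lastId) { hi = mid; } else { lo = mid + 1; }
  }
  return { out: docs.slice(lo, lo + limit), examined: limit };
}

const bySkip = pageBySkip(9000, 1000);   // walks all 10000 docs
const byRange = pageByRange(8999, 1000); // reads only the 1000 returned docs
```

Both calls return the same 1000 documents. In the shell, the range form would be something like `dmDb.Alarm.find({_id: {$gt: lastId}}).sort({_id: 1}).limit(segmentSize)`, remembering the last `_id` of each batch, assuming `_id` order is acceptable for the migration.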

I can't figure out what the problem is.

One more thing: if I increase the segment size, processing 5000 or 10000 documents per batch, the error shows up even earlier, before even 1 million rows are done. Most posts online say the cause is too many connections, but as shown above my connection count is not high at all.
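For scale, a back-of-the-envelope estimate (plain JavaScript; `totalExamined` is only an illustration, not a driver API) of the cumulative work skip-based paging asks of the server, using the numbers from the question:

```javascript
// Rough total cost of paging through `total` documents in batches of
// `segmentSize` with skip(): batch i examines about (i + 1) * segmentSize
// documents (the i * segmentSize it skips plus the segmentSize it returns),
// capped at the collection size for the final partial batch.
function totalExamined(total, segmentSize) {
  const batches = Math.ceil(total / segmentSize);
  let sum = 0;
  for (let i = 0; i < batches; i++) {
    sum += Math.min(total, (i + 1) * segmentSize);
  }
  return sum;
}

// 2 million docs in 1000-doc batches: about 2 billion document
// examinations in total; 10000-doc batches: about 200 million.
const costSmallBatches = totalExamined(2000000, 1000);
const costBigBatches = totalExamined(2000000, 10000);
```

Either batch size still scans the collection front-to-back on every query, hundreds of millions of documents in aggregate, whereas `_id`-range paging examines each document exactly once, i.e. 2 million in total.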
