一、简介

这里简单介绍一下各个工具的使用场景,一般用mysql,redis,mongodb做存储层,hadoop,spark做大数据分析。

  • mysql适合结构化数据,类似excel表格一样定义严格的数据,用于数据量中,速度一般支持事务处理场合

  • redis适合缓存内存对象,如缓存队列,用于数据量小,速度快不支持事务处理高并发场合

  • mongodb,适合半结构化数据,如文本信息,用于数据量大,速度较快不支持事务处理场合

  • hadoop是个生态系统,上面有大数据分析很多组件,适合事后大数据分析任务

  • spark类似hadoop,偏向于内存计算,流计算,适合实时半实时大数据分析任务

移动互联网及物联网让数据呈指数增长,NoSql大数据新起后,数据存储领域发展很快,似乎方向都是向大数据,内存计算,分布式框架,平台化发展,出现不少新的方法,普通应用TB,GB级别达不到PB级别的数据存储,用mongodb,mysql就够了,hadoop,spark这类是航母一般多是大规模应用场景,多用于事后分析统计用,如电商的推荐系统分析系统。IAO

看标题,这里是不是跑题了呢,显然不是,了解一下mongodb在存储中的位置还是非常有必要的,explain 和 hint 一看就知道是从mysql借鉴过来的(猜的),实际就是检测查询语句的性能和使用强制索引

二、explain

先写入测试数据

db.test.insertMany([
{ "_id" : 1, "a" : "f1", b: "food", c: 500 },
{ "_id" : 2, "a" : "f2", b: "food", c: 100 },
{ "_id" : 3, "a" : "p1", b: "paper", c: 200 },
{ "_id" : 4, "a" : "p2", b: "paper", c: 150 },
{ "_id" : 5, "a" : "f3", b: "food", c: 300 },
{ "_id" : 6, "a" : "t1", b: "toys", c: 500 },
{ "_id" : 7, "a" : "a1", b: "apparel", c: 250 },
{ "_id" : 8, "a" : "a2", b: "apparel", c: 400 },
{ "_id" : 9, "a" : "t2", b: "toys", c: 50 },
{ "_id" : 10, "a" : "f4", b: "food", c: 75 }]);

写入成功返回值

{
"acknowledged" : true,
"insertedIds" : [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10
]
}

开始查询

> db.test.find();
{ "_id" : 1, "a" : "f1", "b" : "food", "c" : 500 }
{ "_id" : 2, "a" : "f2", "b" : "food", "c" : 100 }
{ "_id" : 3, "a" : "p1", "b" : "paper", "c" : 200 }
{ "_id" : 4, "a" : "p2", "b" : "paper", "c" : 150 }
{ "_id" : 5, "a" : "f3", "b" : "food", "c" : 300 }
{ "_id" : 6, "a" : "t1", "b" : "toys", "c" : 500 }
{ "_id" : 7, "a" : "a1", "b" : "apparel", "c" : 250 }
{ "_id" : 8, "a" : "a2", "b" : "apparel", "c" : 400 }
{ "_id" : 9, "a" : "t2", "b" : "toys", "c" : 50 }
{ "_id" : 10, "a" : "f4", "b" : "food", "c" : 75 }
> db.test.find().count();
10
> db.test.find({ c: { $gte: 100, $lte: 200 }}).count()
3
> db.test.find({ c: { $gte: 100, $lte: 200 }}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 3,
"executionTimeMillis" : 0,
"totalKeysExamined" : 0,
"totalDocsExamined" : 10,
"executionStages" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"nReturned" : 3,
"executionTimeMillisEstimate" : 0,
"works" : 12,
"advanced" : 3,
"needTime" : 8,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 10
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

看一下几个关键词

"stage" : "COLLSCAN",

"nReturned" : 3,

"totalDocsExamined" : 10,

全部扫描,不走索引,这里只是演示,所以数据量比较少,如果数据量多起来这样查询将会很慢,甚至会卡死

COLLSCAN

这个是什么意思呢? 如果你仔细一看,应该知道就是CollectionScan,就是所谓的“集合扫描”,对不对,看到集合扫描是不是就可以直接map到数据库中的table scan/heap scan呢??? 是的,这个就是所谓的性能最烂最无奈的由来。

nReturned

这个很简单,就是所谓的numReturned,就是说最后返回的num个数,从图中可以看到,就是最终返回了三条。。。

docsExamined

那这个是什么意思呢??就是documentsExamined,检查了10个documents。。。而从返回上面的nReturned。

创建索引并查询

> db.test.createIndex({ c:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.test.find({ c: { $gte: 100, $lte: 200 }}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1
},
"indexName" : "c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
]
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 3,
"executionTimeMillis" : 0,
"totalKeysExamined" : 3,
"totalDocsExamined" : 3,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 3,
"executionTimeMillisEstimate" : 0,
"works" : 4,
"advanced" : 3,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 3,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 3,
"executionTimeMillisEstimate" : 0,
"works" : 4,
"advanced" : 3,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"c" : 1
},
"indexName" : "c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
]
},
"keysExamined" : 3,
"seeks" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

 再看看上面几个关键词

"stage" : "IXSCAN"

"totalDocsExamined" : 3,

瞬间就少了,这样查询时间也会大大减少

三、hint

这时一个很好玩的一个东西,就是用来force mongodb to excute special index,对吧,为了方便演示,我们做两组复合索引,比如这次我们在c和b上构建一下:

创建索引

> db.test.createIndex({ c:1,b:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 2,
"numIndexesAfter" : 3,
"ok" : 1
}
> db.test.createIndex({ b:1,c:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 3,
"numIndexesAfter" : 4,
"ok" : 1
}

  hint查询

> db.test.find({ c: { $gte: 100, $lte: 200 },b:"food"}).hint({c:1,b:1}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"b" : {
"$eq" : "food"
}
},
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1,
"b" : 1
},
"indexName" : "c_1_b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
],
"b" : [
"[\"food\", \"food\"]"
]
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 0,
"totalKeysExamined" : 3,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 1,
"executionTimeMillisEstimate" : 10,
"works" : 3,
"advanced" : 1,
"needTime" : 1,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 1,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 1,
"executionTimeMillisEstimate" : 10,
"works" : 3,
"advanced" : 1,
"needTime" : 1,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"c" : 1,
"b" : 1
},
"indexName" : "c_1_b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
],
"b" : [
"[\"food\", \"food\"]"
]
},
"keysExamined" : 3,
"seeks" : 2,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

 正常查询

> db.test.find({ c: { $gte: 100, $lte: 200 },b:"food"}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"b" : {
"$eq" : "food"
}
},
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"b" : 1,
"c" : 1
},
"indexName" : "b_1_c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"food\", \"food\"]"
],
"c" : [
"[100.0, 200.0]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1,
"b" : 1
},
"indexName" : "c_1_b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
],
"b" : [
"[\"food\", \"food\"]"
]
}
}
},
{
"stage" : "FETCH",
"filter" : {
"b" : {
"$eq" : "food"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1
},
"indexName" : "c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
]
}
}
}
]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 0,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 3,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 1,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"b" : 1,
"c" : 1
},
"indexName" : "b_1_c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"food\", \"food\"]"
],
"c" : [
"[100.0, 200.0]"
]
},
"keysExamined" : 1,
"seeks" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

主要对比的还是:

"totalKeysExamined" : 3,
"totalDocsExamined" : 1,

"totalKeysExamined" : 1,
"totalDocsExamined" : 1,

是不是比较有意思,有时候monogdb并不会,走你想要的索引,当你创建多个联合索引的时候,情况就比较明显了

MongoDB中的explain和hint提的使用的更多相关文章

  1. mongodb之使用explain和hint性能分析和优化

    当你第一眼看到explain和hint的时候,第一个反应就是mysql中所谓的这两个关键词,确实可以看出,这个就是在mysql中借鉴过来的,既然是借鉴 过来的,我想大家都知道这两个关键字的用处,话不多 ...

  2. MongoDB的学习--explain()和hint()

    Explain 从之前的文章中,我们可以知道explain()能够提供大量与查询相关的信息.对于速度比较慢的查询来说,这是最重要的诊断工具之一.通过查看一个查询的explain()输出信息,可以知道查 ...

  3. 在MongoDB中执行查询、创建索引

    1. MongoDB中数据查询的方法 (1)find函数的使用: (2)条件操作符: (3)distinct找出给定键所有不同的值: (4)group分组: (5)游标: (6)存储过程. 文档查找 ...

  4. 在MongoDB中执行查询与创建索引

    实验目的: (1)掌握MongoDB中数据查询的方法: (2)掌握MongoDB中索引及其创建: 实验内容: 一. MongoDB中数据查询的方法: (1)find函数的使用: (2)条件操作符: a ...

  5. mongodb中的排序和索引快速学习

    在mongodb中,排序和索引其实都是十分容易的,先来小结下排序: 1 先插入些数据    db.SortTest.insert( { name : "Denis", age : ...

  6. MongoDB中聚合工具Aggregate等的介绍与使用

    Aggregate是MongoDB提供的众多工具中的比较重要的一个,类似于SQL语句中的GROUP BY.聚合工具可以让开发人员直接使用MongoDB原生的命令操作数据库中的数据,并且按照要求进行聚合 ...

  7. MongoDB中的聚合操作

    根据MongoDB的文档描述,在MongoDB的聚合操作中,有以下五个聚合命令. 其中,count.distinct和group会提供很基本的功能,至于其他的高级聚合功能(sum.average.ma ...

  8. MongoDB 大数据技术之mongodb中在嵌套子文档的文档上面建立索引

    一.给collection objectid赋自定义的值 MongoDB Enterprise > db.testid.insert({_id:{imsi:"4567890123&qu ...

  9. MongoDB 索引 和 explain 的使用

    索引基本使用 索引是对数据库表中一列或多列的值进行排序的一种结构,可以让我们查询数据库变得 更快.MongoDB 的索引几乎与传统的关系型数据库一模一样,这其中也包括一些基本的查 询优化技巧. 首先我 ...

随机推荐

  1. Zookeeper笔记(三)部署与启动Zookeeper

    下载zookeeper安装包 去Zookeeper官网,下载地址http://zookeeper.apache.org/releases.html,建议下载稳定版本,我下载的是zookeeper-3. ...

  2. [Wc]Dface双面棋盘()

    题解: 一道维护奇怪信息的线段树... 我刚开始看了标签想的是删去图上一个点后求连通性 发现不会 于是退化成一般图支持删除 插入 维护连通性 发现有2两种做法 1.lct维护 按照结束顺序先后排序,给 ...

  3. [ZJOI2006]书架

    链接:https://www.luogu.org/problemnew/show/P2596 题解: 写了两天的平衡树终于大概弄好了所有模板(模板不熟写错debug真是要死) 对于放在头尾,只需要删除 ...

  4. python_cookbook之路:数据结构-解压可迭代对象赋值给多个变量以及扩展的迭代解压语法(*)

    1.一一对应: >>> data = [ 'ACME', 50, 91.1, (2012, 12, 21) ] >>> name, shares, price, d ...

  5. BZOJ1295 [SCOI2009]最长距离 最短路 SPFA

    欢迎访问~原文出处——博客园-zhouzhendong 去博客园看该题解 题目传送门 - BZOJ1295 题意概括 有一块矩形土地,被分为 N*M 块 1*1 的小格子. 有的格子含有障碍物. 如果 ...

  6. 【Java】 剑指offer(14) 二进制中1的个数

    本文参考自<剑指offer>一书,代码采用Java语言. 更多:<剑指Offer>Java实现合集   题目 请实现一个函数,输入一个整数,输出该数二进制表示中1的个数.例如把 ...

  7. 为什么NULL指针也能访问成员函数?(但不能访问成员变量)

    查看更加详细的解析请参考这篇文章:http://blog.51cto.com/9291927/2148695 看一个静态绑定的例子: 1 #include <iostream> 2 3 u ...

  8. 爬虫之Resquests模块的使用(二)

    Requests Requests模块 Requests模块是一个用于网络访问的模块,其实类似的模块有很多,比如urllib,urllib2,httplib,httplib2,他们基本都提供相似的功能 ...

  9. C# JSON帮助类(可互转)

    public class JsonHelper { public JsonHelper() { // // TODO: Add constructor logic here // } /// < ...

  10. Java 之递归删除目录

    Java 之递归删除目录 一.思想 必须从最里层的文件开始删除,使用递归删除. 二.源代码:RecursiveDeleteDirectory.java package cn.com.zfc.day01 ...