Small things are better
Yesterday I had fun time repairing 1.5Tb ext3 partition, containing many millions of files. Of course it should have never happened – this was decent PowerEdge 2850 box with RAID volume, ECC memory and reliable CentOS 4.4 distribution but still it did. We had “journal failed” message in kernel log and filesystem needed to be checked and repaired even though it is journaling file system which should not need checks in normal use, even in case of power failures. Checking and repairing took many hours especially as automatic check on boot failed and had to be manually restarted.
Same may happen with Innodb tables. They are designed to never crash, surviving power failures and even partial page writes but still they can get corrupted because of MySQL bugs, OS Bugs or hardware bugs, misconfiguration or failures.
Sometimes
corruption kind be mild, so ALTER TABLE to rebuild the table fixes it.
Sometimes table needs to be dropped and recovered from backup but in
certain cases you may need to reimport whole database – if corruption is
happens to be in undo tablespace or log files.
So do not forget
to have your recovery plan this kind failures. This is one thing you
better to have backups for. Backups however take time to restore,
especially if you do point in time recovery using binary log to get to
actual database state.
The good practice to approach this kind of
problem is first to have enough redundancy. I always assume any
component, such as piece of hardware or software can fail, even if this
piece of hardware has some internal redundancy by itself, such as RAID
or SAN solutions.
If you can’t afford full redundancy for
everything (and probably even if you do) the good idea is to keep your
objects smaller so if you need to do any maintenance with them it will
take less times. Smaller RAID volumes would typically rebuild faster,
smaller database size per system (yet another reason to like medium
end commodity hardware) makes it faster to recover, smaller tables
allow per table backup and recovery to happen faster.
With MySQL
and blocking ALTER TABLE there is yet another reason to keep tables
small, so you do not have to use complicated scenarios to do simple
things. Assume for example you need to add extra column to 500GB
Innodb table. It will probably take long hours or even days for ALTER
TABLE to complete and about 500GB of temporary space will be required
which you simply might not have. You can of course use MASTER-MASTER
replication and run statement on one server, switch role and then do it
on other, but if alter table takes several days do you really can
afford having no box to fall back to for such a long time ?
On
other hand if you would have 500 of 1GB tables it would be very easy –
you can simply move small pieces of data offline for a minute and alter
them live. Also all process will be much faster this way as whole
indexes will well fit in memory for such small tables.
Not to mention splitting 500 tables to several servers will likely be easy than splitting one big one.
There
are bunch of complications with many tables of course, it is not always
easy to partition your data appropriately, also code gets complicated
but for many applications it is worth the trouble
At NNSEEK
for example we have data split at 256 groups of tables. Current data
size is small enough so even single table would not be big problem but
it is much easier to write your code to handle split from very beginning
rather than try to add in later on when there are 100 helper scripts
written etc.
For the same reason I would recommend setting up
multiple virtual servers even if you work with physical one in the
beginning. Different accounts with different permissions will be good
enough. Doing so will ensure you will not have problems once you will
really need to scale to multiple servers.
随机推荐
- C++ 指针初始化要注意的地方
1. 声明多个指针的时候: int* P1,P2; 如上所示,声明的是创建一个指针P1和一个int型的变量P2.而不是声明的两个指针. 对每个指针变量名,都需要使用一个*. 在C++中,int* 是一 ...
- VM打开虚拟机文件报错
用VM打开以前的虚拟机文件报错 Cannot open the disk 'F:/****.vmdk' or one of the snapshot disks it depends on. 这种问题 ...
- 在Linux中安装和配置OpenVPN Server的最简便方法!
本文介绍了如何在基于RPM和DEB的系统中安装和配置OpenVPN服务器.我们在本文中将使用一个名为openvpn-install的脚本,它使整个OpenVPN服务器的安装和配置过程实现了自动化.该脚 ...
- android activity状态保存
一.被其他任务打断(来电话),再次打开希望保留数据 private String SAVE_INSTANCE_TAG = "ATWAL"; @Override protected ...
- 最火的.NET开源项目[转]
综合类 微软企业库 微软官方出品,是为了协助开发商解决企业级应用开发过程中所面临的一系列共性的问题, 如安全(Security).日志(Logging).数据访问(Data Access).配置管理( ...
- Python-学习-项目1-即时标记-1
买了一本Python入门,奈何看不下去,只能是先看后面的项目,看到那里不懂的时候在回去学习. 项目名字:即时标记 大致的意思就是把一个纯文本文件标记成自己想要的格式文件. 首先就是待处理文本,我找不到 ...
- 第三十篇 面向对象的三大特性之继承 supre()
继承 一 .什么是继承? 类的继承跟现实生活中的父.子.孙子.重孙子的继承关系一样,父类又称基类. Python中类的继承分为:单继承 和 多继承. # 定义父类 class ParentClass ...
- 第一篇 Charles的配置及相关使用
// Charles Proxy License // 适用于Charles任意版本的注册码,谁还会想要使用破解版呢. // Charles 4.2目前是最新版,可用. Registered Na ...
- Python升级3.6 强力Django+Xadmin打造在线教育平台
第 1 章 课程介绍 1-1 项目演示和课程介绍: 第 2 章 Windows下搭建开发环境 2-1 Pycharm.Navicat和Python解释器的安装: Pycharmhttp://www.j ...
- Assetbundle2
Assetbundle可以将Prefab封装起来,这是多么方便啊! 而且我也强烈建议大家将Prefab封装成Assetbundle,因为Prefab可以将游戏对象身上带的游戏游戏组件.游戏脚本.材质都 ...