Yesterday I had fun time repairing 1.5Tb ext3 partition, containing many millions of files. Of course it should have never happened – this was decent PowerEdge 2850 box with RAID volume, ECC memory and reliable CentOS 4.4 distribution but still it did. We had “journal failed” message in kernel log and filesystem needed to be checked and repaired even though it is journaling file system which should not need checks in normal use, even in case of power failures. Checking and repairing took many hours especially as automatic check on boot failed and had to be manually restarted.

Same may happen with Innodb tables. They are designed to never crash, surviving power failures and even partial page writes but still they can get corrupted because of MySQL bugs, OS Bugs or hardware bugs, misconfiguration or failures.

Sometimes
corruption kind be mild, so ALTER TABLE to rebuild the table fixes it.
Sometimes table needs to be dropped and recovered from backup but in
certain cases you may need to reimport whole database – if corruption is
happens to be in undo tablespace or log files.

So do not forget
to have your recovery plan this kind failures. This is one thing you
better to have backups for. Backups however take time to restore,
especially if you do point in time recovery using binary log to get to
actual database state.

The good practice to approach this kind of
problem is first to have enough redundancy. I always assume any
component, such as piece of hardware or software can fail, even if this
piece of hardware has some internal redundancy by itself, such as RAID
or SAN solutions.

If you can’t afford full redundancy for
everything (and probably even if you do) the good idea is to keep your
objects smaller so if you need to do any maintenance with them it will
take less times. Smaller RAID volumes would typically rebuild faster,
smaller database size per system (yet another reason to like medium
end commodity hardware) makes it faster to recover, smaller tables
allow per table backup and recovery to happen faster.

With MySQL
and blocking ALTER TABLE there is yet another reason to keep tables
small, so you do not have to use complicated scenarios to do simple
things. Assume for example you need to add extra column to 500GB
Innodb table. It will probably take long hours or even days for ALTER
TABLE to complete and about 500GB of temporary space will be required
which you simply might not have. You can of course use MASTER-MASTER
replication and run statement on one server, switch role and then do it
on other, but if alter table takes several days do you really can
afford having no box to fall back to for such a long time ?

On
other hand if you would have 500 of 1GB tables it would be very easy –
you can simply move small pieces of data offline for a minute and alter
them live. Also all process will be much faster this way as whole
indexes will well fit in memory for such small tables.

Not to mention splitting 500 tables to several servers will likely be easy than splitting one big one.

There
are bunch of complications with many tables of course, it is not always
easy to partition your data appropriately, also code gets complicated
but for many applications it is worth the trouble

At NNSEEK
for example we have data split at 256 groups of tables. Current data
size is small enough so even single table would not be big problem but
it is much easier to write your code to handle split from very beginning
rather than try to add in later on when there are 100 helper scripts
written etc.

For the same reason I would recommend setting up
multiple virtual servers even if you work with physical one in the
beginning. Different accounts with different permissions will be good
enough. Doing so will ensure you will not have problems once you will
really need to scale to multiple servers.

 
参考:
http://www.mysqlperformanceblog.com/2006/10/08/small-things-are-better/

随机推荐

  1. Linux 内核之api_man 手册安装

    开发环境:Ubuntu18.04,虚拟机virtual box 1.安装XML格式转换 sudo apt  install xmlto 2.在内核目录执行 make mandocs  大概持续了半小时 ...

  2. 数据库中pymysql模块的使用

    pymysql 模块 使用步骤: 核心类Connect链接用和Cursor读写用 1. 与数据库服务器建立链接 2. 获取游标对象(用于发送和接收数据) 3. 用游标执行sql语句 4. 使用fetc ...

  3. 环形链表II 142 使用快慢指针(C++实现)

    /** * Definition for singly-linked list. * struct ListNode { * int val; * ListNode *next; * ListNode ...

  4. 002---time & datetime

    time & datetime 时间模块 分类 时间戳 时间字符串 时间元祖 定义 UTC:格林威治时间,世界标准时间,中国(UTC + 8) 时间戳:1970-01-01 0:0:0 开始按 ...

  5. Windows 10 下如何彻底关闭 Hyper-V 服务(翻外篇)

    原文:Windows 10 下如何彻底关闭 Hyper-V 服务(翻外篇) windows禁用/启用hyper-V,解决hyper-V与模拟器同时启用时造成冲突 我是这样解决的,以管理员身份运行命令提 ...

  6. 关于实现mybatis order by 排序传递参数实现 问题记录

    一    问题场景:本人项目纯纯的后端系统  并且项目前端采用纯纯的原生js 实现 1)表格  通过查询列表数据放入到域中  前段采用 for循环的方式实现遍历生成列表 2)分页实现本人是公司内部自定 ...

  7. 从C到C++ (1)

    从C到C++ 一. bool类型 bool取值false和true,是0和1的区别: false可以代表0,但true有很多种,并非只有1. 二. const限定符 常量在定义后就不能修改,所以定义时 ...

  8. Hystrix入门指南

    Introduction 1.Where does the name come from? hystrix对应的中文名字是“豪猪”,豪猪周身长满了刺,能保护自己不受天敌的伤害,代表了一种防御机制,这与 ...

  9. Python 3基础教程30-sys模块

    本文介绍sys模块,简单打印两个重定向输出. 目前使用机会没有,以后实际用到了,再去研究和学习.

  10. Kotlin的Reified类型:怎样在函数内使用这一类型(KAD 14)

    作者:Antonio Leiva 时间:Mar 2, 2017 原文链接:https://antonioleiva.com/reified-types-kotlin/ 对于Java开发者来说,最懊恼的 ...