Performance optimization for large-scale sites, as seen from LiveJournal's development
First, LiveJournal's development history
LiveJournal is a project that began on campus in 1999. A few people built it as a hobby, aiming to provide the following functions:
- Blogs and forums
- Social networking and finding friends
- Aggregation of friends' articles
LiveJournal uses a lot of open source software, and it is itself open source software.
After going online, LiveJournal grew very rapidly:
- April 2004: 2.8 million registered users.
- April 2005: 6.8 million registered users.
- August 2005: 7.9 million registered users.
- Over a thousand page requests handled per second.
- A large number of MySQL servers.
- Heavy use of common components.
Second, an overview of LiveJournal's current architecture
Third, lessons from LiveJournal's development
LiveJournal grew from one server to 100 servers. It went through a great deal of pain along the way, and it also worked out solutions to these problems. Studying LiveJournal lets us avoid the mistakes LJ made and design a system well from the outset, so that we are spared the pain later on.
Let's look at LJ's development step by step.
1. One server
LJ initially ran on a single server that someone had donated, much as Google started out on scrap machines, which deserves our respect. At this stage LJ learned Unix operations at an alarming rate and ran into server performance problems. Fortunately, small fixes were enough to muddle through. At this stage LJ also upgraded from CGI to FastCGI.
Eventually the site kept getting slower, to the point where tuning could no longer solve the problem and more servers were needed. LJ then began offering paid services, presumably hoping to use the money to buy new servers and get out of this predicament.
There is no doubt that at this point LJ had one huge single point of failure: everything was packed into that one tin box of a server.
2. Two servers
With the money earned from paid services, LJ bought two servers: a Dell 6U machine called Kenny to provide Web services, and a Dell 6U server called Cartman to provide database services.
LJ now had bigger disks and more computing resources. The network structure was still very simple: each machine had two network cards, and Cartman provided MySQL database services to Kenny over the internal network.
The load problem was solved temporarily, but new problems emerged:
- One single point of failure had become two single points of failure.
- There was no cold backup or hot backup.
- The site began to slow down again. There was no way around it; it was growing too fast.
- The Web server had reached its CPU limit, so more Web servers were needed.
3. Four servers
Two more machines were bought, Kyle and Stan. Both were 1U and both were used to provide Web services. LJ now had three Web servers and one database server in total, and the load had to be balanced across the three Web servers.
LJ used Kenny as the external gateway and used mod_backhand for load balancing.
Then new problems emerged:
- Single points of failure. The database and the gateway Web server were both single points; if either machine had a problem, the whole service became unavailable. The gateway Web server could have been made quickly switchable by maintaining a heartbeat between machines, but that still would not have solved the database single point, and LJ did not do this at the time.
- The site was slow again, this time because of IO and the database. The question was how to add another database to the application.
4. Five servers
Another database server was bought. The two database servers used database replication (MySQL supports a Master-Slave mode). All write operations go to the master database, and through the binlog the writes on the master are quickly synchronized to the slave. Read operations go to both databases, which can also be seen as a form of load balancing.
With replication, a few things need attention:
- The algorithm that selects a database for a read operation should choose one whose current load is lighter.
- Only read operations should be performed on the slave database server.
- Be ready to deal with delays in the synchronization process; handled badly, they can interrupt database replication. Only write operations need this check, since read operations have no synchronization problem.
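The routing rule described above can be sketched in a few lines. This is only an illustration, not LJ's actual code: the `DbServer` class, the `load` metric, and the `pick_server` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class DbServer:
    name: str
    load: float            # hypothetical load metric, e.g. share of busy connections
    is_master: bool = False


def pick_server(servers, is_write):
    """Send writes to the master; send reads to the least-loaded server."""
    if is_write:
        # Every write goes to the master so that the slave only ever
        # receives changes through replication.
        return next(s for s in servers if s.is_master)
    # Reads may be served by the master or the slave; prefer the lighter one.
    return min(servers, key=lambda s: s.load)


servers = [
    DbServer("master", load=0.8, is_master=True),
    DbServer("slave", load=0.3),
]
print(pick_server(servers, is_write=True).name)
print(pick_server(servers, is_write=False).name)
```

A real selector would also have to account for replication delay, for example by sending a read that must see a just-written row to the master.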
5. More servers
With money coming in, more servers were of course bought. Deployment was quick, but before long things began to slow down again. This time there were more Web servers and more database servers, with contention for both IO and CPU. So BIG-IP was adopted as the load balancing solution.
6. Where we are now
There are now basically enough servers, but performance is still a problem, and the cause lies in the architecture.
The database architecture is the biggest problem. Because every database added to the application was added as a slave, the only benefit is that read operations are spread across multiple machines. The consequence is that write operations are replicated everywhere: every machine has to execute every write. The more servers there are, the greater the waste, and as write operations increase, fewer resources are left to serve read operations.
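A back-of-the-envelope sketch of this effect follows. It assumes, purely for illustration, that each server can handle a fixed number of operations per second and that a write costs the same as a read; the function name and the numbers are made up and are not LJ measurements.

```python
def read_capacity(num_servers, per_server_capacity, writes_per_sec):
    """Reads per second left over when every server must replay every write."""
    spare_per_server = max(per_server_capacity - writes_per_sec, 0)
    return num_servers * spare_per_server


# Four servers, each able to do 1000 operations per second.
for writes in (100, 500, 900):
    print(writes, read_capacity(4, 1000, writes))
```

Under these assumptions, at 900 writes per second each extra server contributes only 100 reads per second, so doubling the fleet from 4 to 8 machines raises read capacity from 400 to just 800.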
(Figure: distribution from one server to two)
(Figure: the final result)
We now find that we do not need to keep a copy of this data on so many servers. The servers already use RAID and the database is already backed up, so these extra copies are a complete waste of resources, redundancy taken to excess. Why not distribute the data storage instead?
With the problem identified, it is time to think about how to solve it. What needs to be done is to store different users' data on different servers, so that data storage is distributed and each machine serves only a relatively fixed set of users. This gives a parallel architecture and good scalability.
To group users, each user is assigned a group tag that marks which database server group holds that user's data. Each database group consists of one master and several slaves, with the number of slaves kept at 2-3. This allocates system resources most sensibly: read operations remain distributed, while excessive data redundancy and excessive consumption of system resources by synchronization are both avoided.
User grouping is controlled by a central server (or a group of them). All user-to-group information is stored on this machine. Every user operation first queries this machine for the user's group number and then fetches the data from that database group.
This user structure is already very similar to LJ's current architecture.
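The two-step lookup described above might look roughly like this. It is a minimal sketch: `GlobalDirectory` and `fetch_user_rows` are hypothetical names, and plain dictionaries stand in for the central server's table and for the database groups.

```python
class GlobalDirectory:
    """Stands in for the central server that maps each user to a database group."""

    def __init__(self):
        self._group_of = {}

    def assign(self, userid, group_id):
        self._group_of[userid] = group_id

    def group_for(self, userid):
        return self._group_of[userid]


def fetch_user_rows(directory, groups, userid):
    group_id = directory.group_for(userid)    # step 1: ask the central server
    return groups[group_id].get(userid, [])   # step 2: query only that group


directory = GlobalDirectory()
directory.assign(userid=1, group_id=1)
directory.assign(userid=2, group_id=2)

# Each database group holds only the rows of the users assigned to it.
groups = {
    1: {1: ["post by user 1"]},
    2: {2: ["post by user 2"]},
}
print(fetch_user_rows(directory, groups, 2))
```

Because the mapping is an explicit table rather than a hash of the user ID, a user can later be moved to another group by updating a single directory entry.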
In the concrete implementation, there are a few caveats:
- Do not use auto-increment IDs inside the database groups, so that users can later be migrated between database groups to achieve a more reasonable distribution of I/O, disk space, and load.
- userid and postid are stored on the global server, where auto-increment can be used. The corresponding values in the database groups must follow the values on the global server. The global server uses the transactional InnoDB storage engine.
- Be extremely careful when migrating users between database groups. While a user is being migrated, that user must not be able to perform write operations.
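A minimal sketch of these caveats, assuming in-memory dictionaries in place of real databases. The names `Directory`, `write_post`, `migrate_user`, and `UserReadOnly` are invented for illustration and are not LJ's API.

```python
import itertools


class UserReadOnly(Exception):
    """Raised when a write arrives for a user who is being migrated."""


class Directory:
    """Stands in for the global server: user-to-group map plus ID allocation."""

    def __init__(self):
        self.group_of = {}                    # userid -> database group id
        self.read_only = set()                # users currently being migrated
        self._post_ids = itertools.count(1)   # global post ID sequence

    def next_post_id(self):
        return next(self._post_ids)


def write_post(directory, groups, userid, text):
    """Store a post in the user's group, using a globally allocated ID."""
    if userid in directory.read_only:
        raise UserReadOnly(f"user {userid} is being migrated")
    post_id = directory.next_post_id()
    group = groups[directory.group_of[userid]]
    group.setdefault(userid, []).append((post_id, text))
    return post_id


def migrate_user(directory, groups, userid, target_group):
    """Move one user's rows to another group while blocking that user's writes."""
    directory.read_only.add(userid)
    try:
        source_group = directory.group_of[userid]
        rows = groups[source_group].pop(userid, [])
        groups[target_group][userid] = rows   # IDs are global, so no renumbering
        directory.group_of[userid] = target_group
    finally:
        directory.read_only.discard(userid)


directory = Directory()
groups = {1: {}, 2: {}}
directory.group_of[42] = 1
write_post(directory, groups, 42, "first post")
migrate_user(directory, groups, 42, target_group=2)
print(directory.group_of[42], groups[2][42])
```

Since post IDs come from the global sequence rather than from a per-group auto-increment column, the migrated rows cannot collide with IDs already present in the target group.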
7. Where we are now
Problems:
- There is a single global master server. If it goes down, user registration and write operations fail for all users.
- Each database group has a single master server. If it goes down, write operations fail for the users in that group.
- If a slave in a database group goes down, the load on the other servers becomes too high.
To address the single point in Master-Slave mode, LJ adopted a Master-Master mode. Master-Master is really a manual arrangement, not something MySQL provides directly: two machines each act as both master and slave, synchronizing with each other.
Implementing Master-Master requires attention to:
- Recovery after a synchronization error on one master, which is best done automatically by the server.
- ID allocation. Because writes happen on both machines at the same time, some IDs may conflict.
Solutions:
- Allocate IDs by parity: one machine writes odd IDs and the other writes even IDs.
- Allocate IDs from the global server (LJ's practice).
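The parity scheme can be illustrated with two counters that share a step of 2 but start at different offsets. This is only a sketch of the idea; `id_allocator` is a hypothetical helper, not part of MySQL or LJ's code.

```python
import itertools


def id_allocator(offset, step=2):
    """Yield offset, offset + step, offset + 2 * step, and so on."""
    return itertools.count(offset, step)


master_a = id_allocator(1)   # this master only hands out odd IDs
master_b = id_allocator(2)   # this master only hands out even IDs

ids_a = [next(master_a) for _ in range(3)]
ids_b = [next(master_b) for _ in range(3)]
print(ids_a, ids_b)
```

The two sequences never overlap, so rows inserted on either master can be replicated to the other without ID conflicts. MySQL's `auto_increment_increment` and `auto_increment_offset` system variables offer a comparable mechanism, although the source says LJ itself allocated IDs from the global server.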
There is another way to use Master-Master mode. Compared with the former, the two machines are still kept synchronized, but only one machine is in service (for both reads and writes) at a time. The two are rotated every night, or switched when a problem appears.
8. Where we are now
Now for a short interlude: MyISAM vs InnoDB.
Using InnoDB:
- Supports transactions.
- Needs more configuration, but it is worth it: data is stored more safely and it runs faster.
Using MyISAM:
- Logging (LJ uses it for network access logs).
- Storing read-only static data, for which it is fast enough.
- Concurrency is poor: data cannot be read and written at the same time (appending data is fine).
- An abnormal MySQL shutdown or crash can corrupt the indexes, which then have to be repaired with myisamchk. When the access volume is large, this happens very often.
9. Cache
Last year I wrote about the caching tool developed by the LJ team, which stores data as key-value pairs in distributed memory. LJ's cache figures:
- 12 dedicated servers (not donated)
- 28 instances
- 30GB total capacity
- 90-93% hit rate (those who have used squid may know that squid's hit rate with memory plus disk is about 70-80%)
How to build a caching strategy?
Cache everything? That is not possible. We only need to cache what has already become, or may become, a system bottleneck, in order to improve system efficiency. By analyzing the MySQL logs, we can find the objects worth caching.
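As a rough illustration of how a key-value cache sits in front of the database, here is a minimal read-through sketch. It assumes plain dictionaries in place of both the distributed cache and MySQL, and `CachedStore` is a hypothetical name, not the API of LJ's tool.

```python
class CachedStore:
    """Look in the cache first and fall back to the backend on a miss."""

    def __init__(self, backend):
        self.backend = backend   # dict standing in for the database
        self.cache = {}          # dict standing in for the distributed cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.backend[key]   # the expensive query we want to avoid
        self.cache[key] = value     # remember it for the next request
        return value

    def set(self, key, value):
        self.backend[key] = value
        self.cache.pop(key, None)   # invalidate so readers never see stale data

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


store = CachedStore({"user:1": "brad"})
for _ in range(10):
    store.get("user:1")
print(store.hit_rate())
```

Only the first of the ten reads reaches the backend. Tracking the hit rate this way is also how one would check whether the chosen objects are really worth caching.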
The disadvantages of caching?
Nothing is perfect, and caching has its drawbacks too:
- More development work, since special code has to be written for caching.
- Management becomes harder, and more people are needed for system maintenance.
- And of course, large amounts of memory cost money.
10. Web access load balancing
BIG-IP works at the packet level. It knows nothing about our internal processing and cannot decide which server should handle a given request. Reverse proxies did not help either: they were not fast enough and fell short of the effect we wanted.
So LJ developed its own tool. Features:
- A fast, small, manageable HTTP web server / proxy
- Can forward requests internally
- Written in Perl
- Single-threaded, asynchronous, and event-based; uses epoll or kqueue
- Supports console management and remote management over HTTP, and supports loading configuration dynamically
- Multiple modes: web server, reverse proxy, plugins
- Plugin support: GIF/PNG conversion?
11. MogileFS
LJ uses the open source MogileFS as its distributed file storage system. MogileFS is very simple to use. Its main design ideas are:
- Every file belongs to a class (the class is the smallest unit of replication)
- It tracks where each file is stored
- Copies are stored on different hosts
- A MySQL cluster stores the distribution information in one place
- Large, inexpensive disks
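The design ideas listed above can be sketched as a toy tracker. This is not MogileFS's real interface: the `Tracker` class, its methods, and the least-used placement rule are assumptions made for illustration, and in-memory dictionaries replace the MySQL cluster.

```python
class Tracker:
    """Records which hosts hold each file and places copies on distinct hosts."""

    def __init__(self, hosts, copies_per_class):
        self.hosts = list(hosts)
        self.copies_per_class = dict(copies_per_class)  # class name -> copy count
        self.locations = {}                             # file key -> list of hosts
        self.usage = {host: 0 for host in self.hosts}   # files stored per host

    def store(self, key, file_class):
        copies = self.copies_per_class[file_class]
        if copies > len(self.hosts):
            raise ValueError("not enough hosts for the requested copy count")
        # Pick the least-used hosts; each host appears once, so copies never
        # share a machine.
        chosen = sorted(self.hosts, key=lambda h: self.usage[h])[:copies]
        for host in chosen:
            self.usage[host] += 1
        self.locations[key] = chosen
        return chosen

    def locate(self, key):
        return self.locations[key]


tracker = Tracker(["host1", "host2", "host3"], {"userpic": 2, "thumbnail": 1})
tracker.store("pic-1", "userpic")
tracker.store("thumb-1", "thumbnail")
print(tracker.locate("pic-1"), tracker.locate("thumb-1"))
```

Setting the copy count per class rather than per file means that important files can be replicated more widely than files that are cheap to regenerate.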
That is all for now; more of the original documentation is available online. Its authors have taken this material to MySQL Con twice, to OS Con twice, and to numerous other meetings, selflessly sharing their experience so that we can learn from it. In the web 2.0 era, rapid development gets more and more attention, but good design is still the foundation of every application. A web 2.0 site on its way to becoming a Top 500 website should not let its architecture hold back its growth.